Polymarket-v1 Database
Pith reviewed 2026-06-28 07:44 UTC · model grok-4.3
The pith
Ground-truth aggressor direction from the blockchain settlement layer shows that true VPIN predicts Brier scores, while VPIN built from classified proxies yields a materially weaker relationship.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The complete on-chain archive supplies 100 percent ground-truth aggressor direction unavailable in prior prediction-market data sets. Tick-rule and bulk-volume classifiers achieve only 49.83 percent and 50.51 percent aggregate accuracy, with systematic price-level bias arising from positive trade-direction autocorrelation and concentrated market-making. These errors cause inferred VPIN to diverge from ground-truth VPIN and bias OFI estimates. Ground-truth VPIN positively predicts Brier scores while Gibbs spread negatively predicts them, yet the same relationships are materially attenuated when ground-truth metrics are replaced by classified proxies.
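To make the benchmarking step concrete, here is a minimal sketch of a tick-rule classifier scored against known aggressor signs. It assumes the usual tick-rule convention (uptick is a buy, downtick is a sell, zero tick carries the previous label); the function names and the toy prices and labels are hypothetical and are not taken from the paper or the database.

```python
from typing import List, Optional, Sequence


def tick_rule(prices: Sequence[float]) -> List[Optional[int]]:
    """Label each trade +1 (buyer-initiated) or -1 (seller-initiated).

    Uptick -> +1, downtick -> -1, zero tick -> reuse the previous label.
    Trades before the first price change stay unclassified (None).
    """
    labels: List[Optional[int]] = []
    last: Optional[int] = None
    for i, price in enumerate(prices):
        if i > 0 and price > prices[i - 1]:
            last = 1
        elif i > 0 and price < prices[i - 1]:
            last = -1
        labels.append(last)
    return labels


def accuracy(predicted: Sequence[Optional[int]], truth: Sequence[int]) -> float:
    """Share of classified trades whose label matches the ground truth."""
    pairs = [(p, t) for p, t in zip(predicted, truth) if p is not None]
    if not pairs:
        return float("nan")
    return sum(p == t for p, t in pairs) / len(pairs)


# Hypothetical toy data: trade prices and ground-truth aggressor signs.
prices = [0.50, 0.51, 0.51, 0.50, 0.50, 0.52]
truth = [1, 1, -1, -1, 1, 1]
predicted = tick_rule(prices)
print(predicted, accuracy(predicted, truth))
```

On this toy series the zero-tick trades are the ones that go wrong, which is the kind of failure the claim attributes to direction autocorrelation; the real evaluation would of course run over the full archive.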
What carries the argument
The 100 percent ground-truth aggressor direction extracted from the blockchain settlement layer, used both to benchmark classical classifiers and to validate microstructure metrics against subsequent forecast accuracy.
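As an illustration of the second classical classifier being benchmarked, the sketch below implements a simple form of bulk volume classification: each bar's volume is split into buy and sell parts using the standard normal CDF of the standardized price change. The choice of the normal CDF, the population standard deviation, and all names and toy numbers are assumptions made for illustration; the paper's exact bar construction and distribution are not given on this page.

```python
import math
from statistics import pstdev
from typing import List, Sequence, Tuple


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bulk_volume_classification(
    bar_prices: Sequence[float], bar_volumes: Sequence[float]
) -> List[Tuple[float, float]]:
    """Split each bar's volume into (buy_volume, sell_volume).

    The buy share of a bar is the normal CDF of its price change divided
    by the standard deviation of all price changes. The first bar has no
    price change and is skipped.
    """
    changes = [b - a for a, b in zip(bar_prices, bar_prices[1:])]
    sigma = pstdev(changes) if changes else 0.0
    result = []
    for change, volume in zip(changes, bar_volumes[1:]):
        buy_share = normal_cdf(change / sigma) if sigma > 0 else 0.5
        result.append((volume * buy_share, volume * (1.0 - buy_share)))
    return result


# Hypothetical bars: closing prices and traded volume per bar.
bar_prices = [0.50, 0.52, 0.52, 0.50]
bar_volumes = [100.0, 100.0, 200.0, 100.0]
split = bulk_volume_classification(bar_prices, bar_volumes)
print(split)
```

Because the split depends only on price changes, a bar with heavy one-sided flow but no price move is assigned half buys and half sells, which is one way such a classifier can drift from the settlement-layer labels.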
If this is right
- Classification errors propagate directly into VPIN and OFI, producing biased transaction-cost estimates.
- Higher true VPIN goes with higher (worse) Brier scores, consistent with informed volume coinciding with poorer calibration.
- Higher Gibbs spread goes with lower (better) Brier scores, consistent with high-spread markets drawing informed specialists rather than noise traders.
- Any study that substitutes classified proxies for ground-truth metrics will understate the strength of microstructure-forecast linkages.
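The propagation of label errors into VPIN and OFI can be sketched with a small volume-bucket calculation. This assumes the common equal-volume-bucket form of VPIN and a signed-volume form of OFI; the bucket size, the function names, and the toy trades are hypothetical, and the paper's own parameter choices are not stated on this page.

```python
from typing import List, Sequence, Tuple

Trade = Tuple[float, int]  # (volume, aggressor sign: +1 buy, -1 sell)


def volume_buckets(trades: Sequence[Trade], bucket_size: float) -> List[Tuple[float, float]]:
    """Split signed trades into equal-volume buckets of (buy, sell) volume.

    A trade that crosses a bucket boundary is divided between buckets.
    A trailing partial bucket is dropped.
    """
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    buckets = []
    buy = sell = 0.0
    for volume, sign in trades:
        remaining = volume
        while remaining > 0:
            fill = min(remaining, bucket_size - (buy + sell))
            if sign > 0:
                buy += fill
            else:
                sell += fill
            remaining -= fill
            if buy + sell >= bucket_size:
                buckets.append((buy, sell))
                buy = sell = 0.0
    return buckets


def vpin(buckets: Sequence[Tuple[float, float]], bucket_size: float) -> float:
    """Average absolute buy/sell imbalance per bucket, scaled by bucket size."""
    if not buckets:
        return float("nan")
    return sum(abs(b - s) for b, s in buckets) / (len(buckets) * bucket_size)


def ofi(trades: Sequence[Trade]) -> float:
    """Order flow imbalance as net signed volume."""
    return sum(volume * sign for volume, sign in trades)


# Hypothetical trades with true labels, and the same trades with one label flipped.
true_trades = [(30.0, 1), (30.0, 1), (40.0, -1), (50.0, -1), (50.0, 1)]
noisy_trades = [(30.0, 1), (30.0, 1), (40.0, 1), (50.0, -1), (50.0, 1)]
true_vpin = vpin(volume_buckets(true_trades, 100.0), 100.0)
noisy_vpin = vpin(volume_buckets(noisy_trades, 100.0), 100.0)
print(true_vpin, noisy_vpin, ofi(true_trades), ofi(noisy_trades))
```

A single flipped label moves both metrics in this toy case, which is the mechanism behind the first bullet; how large the effect is at scale depends on the real error pattern.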
Where Pith is reading between the lines
- Prediction-market platforms may need classifiers explicitly adjusted for persistent direction autocorrelation rather than relying on equity-market defaults.
- The same ground-truth labels could be used to train market-specific classifiers that recover accurate VPIN at scale.
- On-chain settlement data from other decentralized prediction or betting venues would allow direct tests of whether the same classification failures appear outside Polymarket.
Load-bearing premise
The on-chain settlement layer supplies 100 percent accurate aggressor direction for every trade record without extraction errors or ambiguities.
What would settle it
The attenuation claim would be undermined if re-estimating the VPIN-Brier and Gibbs-Brier regressions on the same markets, after replacing ground-truth labels with tick-rule labels at the observed error rate, produced slope coefficients statistically indistinguishable from the ground-truth results.
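A rough sketch of such a check: flip ground-truth signs at a chosen error rate, recompute a per-market imbalance measure, and compare the regression slope against Brier scores with the slope obtained from the clean labels. The absolute net imbalance used here is a simplified stand-in for VPIN, random flips are a stand-in for tick-rule errors (which the paper describes as systematic, not random), and every name and number is hypothetical.

```python
import random
from typing import List, Sequence


def flip_labels(signs: Sequence[int], error_rate: float, rng: random.Random) -> List[int]:
    """Flip each +1/-1 label independently with probability error_rate."""
    return [-s if rng.random() < error_rate else s for s in signs]


def abs_imbalance(signs: Sequence[int]) -> float:
    """Absolute net direction per trade; a crude proxy for flow toxicity."""
    return abs(sum(signs)) / len(signs)


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Slope of a univariate least-squares fit of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx


def slope_under_noise(markets, brier, error_rate: float, seed: int = 0) -> float:
    """Slope of Brier score on imbalance after corrupting the labels."""
    rng = random.Random(seed)
    x = [abs_imbalance(flip_labels(signs, error_rate, rng)) for signs in markets]
    return ols_slope(x, brier)


# Hypothetical markets (lists of aggressor signs) and their Brier scores.
markets = [[1] * 10, [1] * 8 + [-1] * 2, [1] * 6 + [-1] * 4, [1] * 5 + [-1] * 5]
brier = [0.30, 0.22, 0.14, 0.10]
clean_slope = slope_under_noise(markets, brier, error_rate=0.0)
noisy_slope = slope_under_noise(markets, brier, error_rate=0.5)
print(clean_slope, noisy_slope)
```

With four toy markets the noisy slope is not informative on its own; the point is only the shape of the comparison, which on real data would need standard errors and many markets.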
Original abstract
We introduce the Polymarket-v1 Database: the complete on-chain trade archive of Polymarket's first-generation CTF Exchange on Polygon, spanning 2022-11-21 to 2026-04-28 and covering the full contract lifecycle from first settlement to natural termination. The dataset comprises 1.20 billion trade records across 1.30 million markets with $61 billion in nominal volume. Its defining feature is 100% ground-truth aggressor direction derived from the blockchain settlement layer, a property unavailable in existing prediction market archives, which rely on heuristic inference. We use this truth-aligned archive to benchmark standard microstructure tools and document three findings. First, the tick rule and bulk volume classification achieve near-random aggregate accuracy (49.83% and 50.51%), but this masks a systematic, correctable price-level gradient driven by positive trade direction autocorrelation and concentrated market-making -- two structural features of prediction markets that violate the mean-reversion assumption embedded in classical classifiers. Second, these classification errors propagate into downstream metrics: inferred VPIN diverges substantially from ground-truth VPIN, and OFI estimates are directionally biased, with material consequences for Transaction Cost Analysis. Third, ground-truth microstructure quality predicts forecasting performance in ways that classification-based proxies cannot recover: True VPIN positively predicts Brier scores, while Gibbs spread negatively predicts them -- a selection effect reflecting that high-spread niche markets attract informed specialists rather than noise traders. Replacing ground-truth metrics with classified proxies attenuates both relationships, illustrating that measurement accuracy at the transaction level is a prerequisite for reliable inference about prediction market design and probability calibration.
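For readers less familiar with the two quantities being related, the standard textbook forms are given below. These are the usual definitions, stated here as an assumption; the abstract does not say which exact variants the paper uses. Here $p_i$ is the forecast probability and $o_i \in \{0, 1\}$ the realized outcome for market $i$, while $V_\tau^{B}$ and $V_\tau^{S}$ are buy and sell volume in volume bucket $\tau$, with $n$ buckets of size $V$.

```latex
\mathrm{BS} = \frac{1}{N} \sum_{i=1}^{N} \left( p_i - o_i \right)^2,
\qquad
\mathrm{VPIN} = \frac{1}{nV} \sum_{\tau=1}^{n} \left| V_\tau^{B} - V_\tau^{S} \right|
```

A lower Brier score means better calibrated forecasts, so "positively predicts Brier scores" reads as higher VPIN going with worse forecasting performance.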
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Polymarket-v1 Database: 1.20 billion on-chain trade records from Polymarket's CTF Exchange on Polygon (2022-11-21 to 2026-04-28) across 1.30 million markets and $61 billion nominal volume. Its core contribution is the provision of 100% ground-truth aggressor direction extracted from the blockchain settlement layer. Using this archive the authors benchmark the tick rule (49.83% accuracy) and bulk volume classification (50.51% accuracy), attribute the near-random performance to positive trade-direction autocorrelation and concentrated market-making, document propagation of classification errors into VPIN and OFI, and report that ground-truth VPIN positively and Gibbs spread negatively predict Brier scores while classification-based proxies attenuate both relationships.
Significance. If the ground-truth aggressor flags are verifiably error-free, the database supplies a large-scale, externally validated resource for prediction-market microstructure that is unavailable in existing archives. The documented divergences between inferred and true VPIN/OFI, together with the differential predictive power for Brier scores, would constitute concrete evidence that transaction-level direction accuracy is a prerequisite for reliable inference on forecasting performance and market design.
major comments (2)
- [Abstract] The claim that the dataset supplies '100% ground-truth aggressor direction derived from the blockchain settlement layer' is load-bearing for every accuracy number, divergence result, and Brier-score relationship, yet the manuscript supplies no description of the extraction algorithm, handling of atomic multi-leg settlements, partial fills, or contract-event decoding ambiguities.
- [Abstract] The statements that 'True VPIN positively predicts Brier scores, while Gibbs spread negatively predicts them' and that 'Replacing ground-truth metrics with classified proxies attenuates both relationships' are presented without reference to the underlying statistical specifications, sample construction, or robustness checks, preventing assessment of whether these selection-effect interpretations are supported by the data.
Simulated Author's Rebuttal
We thank the referee for the detailed reading and constructive comments on the abstract. Both points identify areas where additional clarity would strengthen the manuscript. We address each below and will revise accordingly.
Point-by-point responses
- Referee: [Abstract] The claim that the dataset supplies '100% ground-truth aggressor direction derived from the blockchain settlement layer' is load-bearing for every accuracy number, divergence result, and Brier-score relationship, yet the manuscript supplies no description of the extraction algorithm, handling of atomic multi-leg settlements, partial fills, or contract-event decoding ambiguities.
  Authors: We agree that the abstract does not describe the extraction procedure. The full manuscript contains a methods section that specifies the on-chain event decoding logic, the treatment of atomic multi-leg settlements as single transactions, the identification of partial fills via cumulative fill events, and the resolution of contract-event ambiguities through the CTF settlement contract ABI. To make this transparent at the point of first reading, we will add a single sentence to the abstract summarizing the extraction approach and will include an explicit cross-reference to the methods section.
  Revision: yes
- Referee: [Abstract] The statements that 'True VPIN positively predicts Brier scores, while Gibbs spread negatively predicts them' and that 'Replacing ground-truth metrics with classified proxies attenuates both relationships' are presented without reference to the underlying statistical specifications, sample construction, or robustness checks, preventing assessment of whether these selection-effect interpretations are supported by the data.
  Authors: The abstract condenses results that are fully specified in the empirical section: market-day panel regressions of Brier score on VPIN and Gibbs spread with market-type fixed effects, volume controls, and robustness to alternative sample windows and winsorization. The attenuation result is shown via side-by-side coefficient comparisons. We will revise the abstract to include brief parenthetical references to the regression specification and sample definition, thereby directing readers to the supporting details without lengthening the abstract excessively.
  Revision: yes
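The regression described in the second response can be sketched as a within (fixed-effects) estimator: demean each variable by market type, then run least squares on the demeaned data. This is a minimal illustration only. It omits the volume controls, robustness checks, and standard errors mentioned in the rebuttal, and the function names, market types, and coefficients in the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def demean_by_group(values: Sequence[float], groups: Sequence[Hashable]) -> List[float]:
    """Subtract each group's mean, which absorbs group fixed effects."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def fe_ols(y, x1, x2, groups) -> Tuple[float, float]:
    """Within estimator with two regressors, solved via the 2x2 normal equations."""
    yd = demean_by_group(y, groups)
    a = demean_by_group(x1, groups)
    b = demean_by_group(x2, groups)
    s11 = sum(u * u for u in a)
    s22 = sum(v * v for v in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * w for u, w in zip(a, yd))
    s2y = sum(v * w for v, w in zip(b, yd))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("regressors are collinear after demeaning")
    return (s1y * s22 - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Hypothetical market-day panel with two market types.
groups = ["politics"] * 3 + ["sports"] * 3
vpin_values = [0.1, 0.2, 0.4, 0.3, 0.5, 0.6]
gibbs_spread = [0.02, 0.01, 0.03, 0.05, 0.02, 0.04]
type_effect = {"politics": 0.1, "sports": 0.2}
# Toy outcome built with made-up coefficients (+0.5 on VPIN, -2.0 on spread).
brier = [type_effect[g] + 0.5 * v - 2.0 * s
         for g, v, s in zip(groups, vpin_values, gibbs_spread)]
beta_vpin, beta_gibbs = fe_ols(brier, vpin_values, gibbs_spread, groups)
print(beta_vpin, beta_gibbs)
```

Running the same function once with ground-truth VPIN and once with a classified proxy would give the side-by-side coefficient comparison the authors describe.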
Circularity Check
No circularity; empirical database and benchmarks are self-contained
The paper introduces an on-chain trade archive and uses its claimed ground-truth aggressor flags to benchmark tick-rule and bulk-volume classifiers, then reports divergences in VPIN/OFI and correlations between true microstructure metrics and Brier scores. No derivation chain reduces a claimed prediction or result to a fitted parameter or self-citation by construction; the central findings are direct empirical comparisons against an external data source rather than self-referential equations or renamed inputs. The work is a standard empirical contribution whose validity hinges on the accuracy of the blockchain extraction, not on internal definitional loops.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: the blockchain settlement layer supplies 100% accurate aggressor direction for every trade