What Prediction Markets Can See: Market Formation, Settlement Legibility, and the Geography of Tradable Uncertainty in Africa and Latin America

Ade Adegbenro

arXiv: 2606.17503 · v1 · pith:IQAV2AP3new · submitted 2026-06-13 · econ.GN · q-fin.EC

Pith reviewed 2026-06-27 04:35 UTC · model grok-4.3

keywords: prediction markets, market formation, settlement legibility, tradable uncertainty, Africa, Latin America, Polymarket, Kalshi

The pith

Prediction market contracts form only where platforms can credibly settle outcomes, so inventories reflect settlement rules as much as trader beliefs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies the institutional conditions that determine which uncertainties become tradable contracts on platforms like Polymarket and Kalshi, rather than evaluating prices after contracts exist. It builds a coded measure of settlement legibility, which scores how easily an event can be worded, sourced, and credibly resolved by third parties, and applies it to 6,047 Africa-topic and Latin America-topic contracts. The data show that formation is selective: African inventory clusters in football while salient civic events produce little or no inventory, Latin American inventory centers on Venezuela, and sports and elections score high on legibility while conflict scores low. Among listed contracts, higher legibility is associated with lower trading value, and a test against 131 external civic events finds that legibility predicts listing in the expected direction but falls short of the pre-specified acceptance criteria. This leads to the claim that market inventories measure what can be settled at least as much as what the public cares about.

Core claim

Using an audited dataset of 6,047 contracts, the author constructs and validates a settlement legibility measure that orders contract formation, with sports and elections near the top and conflict at the bottom; legibility predicts listing in the expected direction against an external frame of 131 civic events but falls short of acceptance criteria, while among listed contracts the relation between legibility and trading value is negative.

What carries the argument

The coded settlement legibility measure, which scores the degree to which an uncertainty can be worded, sourced, and credibly resolved by third parties.

If this is right

African prediction market inventory concentrates overwhelmingly in football while salient civic events produce little or no inventory.
Latin American inventory is deeper but dominated by Venezuela, where attention to prospective United States military action sustains the largest civic cluster.
Legibility orders the inventory steeply, with sports and elections near the top of the scale and conflict at the bottom.
Among listed contracts, the relation between legibility and trading value is negative.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If settlement legibility limits contract formation, then prediction markets may systematically under-represent uncertainties in regions with weaker data or institutional infrastructure.
Platforms could expand coverage of important but low-legibility events by investing in better resolution sources rather than waiting for natural improvements in data availability.
Treating market volumes or prices as direct maps of public interest risks conflating platform capabilities with trader attention, especially for topics outside high-legibility domains like sports and elections.

Load-bearing premise

The settlement legibility measure, validated on 451 units, accurately captures the institutional constraints that shape formation across the full set of 6,047 contracts.

What would settle it

Finding many high-legibility civic events in Africa or Latin America that remain unlisted on both platforms, or many low-legibility conflict events that receive contracts and high trading volume, would challenge the claim that legibility orders formation.

Original abstract

Prediction markets are usually evaluated after their contracts exist, by asking how well prices forecast outcomes. We study the prior institutional margin of market formation, asking which uncertainties become tradable contracts at all. Using an audited dataset of 6,047 Africa-topic and Latin America-topic contracts listed on Polymarket and Kalshi, we construct a coded measure of settlement legibility, the degree to which an uncertainty can be worded, sourced, and credibly resolved by third parties, and validate it on 451 units under a frozen codebook, where independent double scoring reaches ordinal reliabilities of 0.92 and 0.96 on the primary dimensions and blind human benchmarks reach 0.97 and 0.92. Using this measure, we find that formation is selective in ways that public importance does not explain, with African inventory concentrated overwhelmingly in football while salient civic events produce little or no inventory, and Latin American inventory deeper but dominated by Venezuela, where attention to prospective United States military action sustains the largest civic cluster in the data. Legibility orders the inventory steeply, with sports and elections near the top of the scale and conflict at the bottom. In a formation test against an externally assembled frame of 131 civic events, legibility predicts listing in the expected direction but falls short of pre-specified acceptance criteria, while among listed contracts the relation between legibility and trading value is negative, as a model of selective listing implies and as we predicted before estimation. Prediction-market inventories therefore measure what platforms can settle as much as what traders believe, and reading them as maps of public interest conflates the two.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper shows settlement legibility shapes which uncertainties get listed on these platforms, with decent validation work but unclear sampling for the full inventory.

This paper's main point is that prediction market formation in Africa and Latin America tracks how easily an event can be worded and resolved by third parties, not just how much people care about it.

The new element is the settlement legibility framing itself, plus the application to a large set of contracts from Polymarket and Kalshi on these two regions. They coded the measure, validated it on 451 units with solid inter-rater numbers, and ran a pre-specified test against an external list of civic events. The negative link between legibility and trading value among listed contracts matches what their selective-listing model predicts.

The soft spot is the step from the 451 validated units to the full 6,047 contracts. The abstract gives no detail on how those 451 were chosen, so if they were the easier or more obvious cases the measure may not apply evenly. The formation test also fell short of the pre-specified bar, which weakens the claim that legibility orders what gets listed. The patterns in football and Venezuela are clear in the data but rest on the assumption that public importance would have produced more civic contracts if legibility were not the constraint.

Anyone using prediction market prices as belief measures in development or political economy work would get value from this, especially if they already worry about platform selection effects. It deserves a serious referee because the core idea is distinct from standard accuracy studies and the data collection is real effort, though the sampling description and test results need tightening before publication.

Referee Report

2 major / 2 minor

Summary. The paper claims that prediction-market contract formation on Polymarket and Kalshi for Africa-topic and Latin America-topic uncertainties is driven primarily by settlement legibility (the degree to which an event can be worded, sourced, and credibly resolved by third parties) rather than by public importance or trader beliefs alone. Using an audited inventory of 6,047 contracts and a coded legibility measure validated on 451 units (ordinal reliabilities 0.92/0.96, human benchmarks 0.97/0.92), the author shows steep ordering by legibility (sports and elections high, conflict low), selective formation (football concentration, Venezuela dominance), a formation test on 131 civic events that predicts in the expected direction but falls short of pre-specified criteria, and a negative legibility–trading-value relation among listed contracts, as predicted by selective-listing logic.

Significance. If the legibility measure generalizes, the result supplies a falsifiable account of why prediction-market inventories are not neutral maps of uncertainty or attention; the pre-specified test, the negative value relation, and the audited dataset are concrete strengths that allow direct evaluation of the central claim.

major comments (2)

[Abstract / Validation] Abstract and validation description: the 451-unit validation sample is reported to have been drawn from the 6,047-contract inventory under a frozen codebook, yet no sampling frame, stratification, or randomness check is provided. Because the measure is then used to explain selectivity across the full inventory (football concentration, Venezuela cluster, negative value relation), the absence of sampling information directly undermines the claim that legibility, rather than unmeasured topic- or platform-specific factors, accounts for observed patterns.
[Formation test] Formation test paragraph: the test against the externally assembled 131-event civic frame is described as falling short of pre-specified acceptance criteria while still showing the expected directional relation. Given that this test was intended to validate the measure’s predictive power for listing, the shortfall requires either a revised acceptance threshold with justification or additional robustness checks (e.g., alternative frames or power analysis) before the result can be treated as supportive of the legibility account.

minor comments (2)

[Abstract] The abstract states ordinal reliabilities of 0.92 and 0.96 on the primary dimensions but does not indicate whether these apply to the full multi-item scale or to individual components; a supplementary table listing dimension-level statistics would improve transparency.
[Data construction] No explicit statement appears on how the 6,047 contracts were audited or on inclusion/exclusion rules for the Africa/Latin America topic filters; a short methods subsection on data construction would aid replicability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments. We address each major comment below, indicating planned revisions where appropriate.

Referee: [Abstract / Validation] Abstract and validation description: the 451-unit validation sample is reported to have been drawn from the 6,047-contract inventory under a frozen codebook, yet no sampling frame, stratification, or randomness check is provided. Because the measure is then used to explain selectivity across the full inventory (football concentration, Venezuela cluster, negative value relation), the absence of sampling information directly undermines the claim that legibility, rather than unmeasured topic- or platform-specific factors, accounts for observed patterns.

Authors: We agree that the sampling procedure for the 451-unit validation sample requires fuller documentation. The sample was drawn from the audited 6,047-contract inventory after the codebook was frozen, with the explicit goal of covering the observed range of topics and platforms. We will revise the methods section to specify the sampling frame, any stratification by topic category (sports, elections, conflict, other), stratum sizes, and the post-selection check confirming randomness within strata. This addition will directly address concerns about representativeness and allow readers to evaluate whether unmeasured factors could confound the legibility patterns observed on the full inventory.
Revision: yes
Referee: [Formation test] Formation test paragraph: the test against the externally assembled 131-event civic frame is described as falling short of pre-specified acceptance criteria while still showing the expected directional relation. Given that this test was intended to validate the measure’s predictive power for listing, the shortfall requires either a revised acceptance threshold with justification or additional robustness checks (e.g., alternative frames or power analysis) before the result can be treated as supportive of the legibility account.

Authors: The manuscript already states that the formation test fell short of the pre-specified criteria while showing the expected direction. We will add two elements in revision: (1) a power analysis assessing whether the 131-event sample has adequate power to detect an effect of the observed size, and (2) robustness checks that repeat the test on an alternative civic-event frame drawn from independent news archives. We will also provide a brief justification for retaining the directional result as supportive evidence when interpreted alongside the pre-specified negative legibility–trading-value relation among listed contracts. These changes will strengthen the evidential basis without altering the reported shortfall on the original criteria.
Revision: partial
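
The power analysis proposed in the response above could be approached by simulation. The Python sketch below is illustrative only: the even high/low legibility split, the listing rates, the one-sided Fisher exact test, and the 5% level are all assumptions made for the example, not values or criteria taken from the paper. Only the frame size of 131 events comes from the text.

```python
import math
import random


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table
    [[a, b], [c, d]] = [[high listed, high unlisted],
                        [low listed,  low unlisted]].
    Alternative: high-legibility events are MORE likely to be listed.
    """
    r1, r2, c1 = a + b, c + d, a + c
    total = math.comb(r1 + r2, c1)
    # Sum hypergeometric probabilities of tables at least as extreme as observed.
    tail = sum(
        math.comb(r1, x) * math.comb(r2, c1 - x)
        for x in range(a, min(r1, c1) + 1)
    )
    return tail / total


def simulate_power(n_high, n_low, p_high, p_low,
                   alpha=0.05, n_sims=2000, seed=0):
    """Share of simulated frames in which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        listed_high = sum(rng.random() < p_high for _ in range(n_high))
        listed_low = sum(rng.random() < p_low for _ in range(n_low))
        p_value = fisher_one_sided(listed_high, n_high - listed_high,
                                   listed_low, n_low - listed_low)
        rejections += p_value < alpha
    return rejections / n_sims


if __name__ == "__main__":
    # Hypothetical scenario: 131 events split 66/65, listing rates 30% vs 10%.
    print(simulate_power(66, 65, p_high=0.30, p_low=0.10))
```

Power computed this way depends on the assumed listing rates, so a revision would need to state which effect size the pre-specified criteria were designed to detect.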

Circularity Check

0 steps flagged

No significant circularity; measure constructed and validated independently with pre-specified predictions

The paper constructs a coded settlement legibility measure, validates it independently on a 451-unit sample with reported reliabilities of 0.92/0.96 and human benchmarks of 0.97/0.92 under a frozen codebook, then applies the measure to the full 6,047-contract inventory. The formation test against an external 131-event frame and the negative legibility–trading-value relation among listed contracts are explicitly described as pre-specified predictions rather than post-estimation results. No equations, self-citations, fitted parameters renamed as predictions, or self-definitional steps appear in the abstract or described derivation. The central claim that inventories reflect settlement constraints follows from these independent empirical patterns without reducing to the inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Abstract-only review; the central claim rests on the validity and external applicability of the settlement legibility coding scheme and on the assumption that the Polymarket and Kalshi inventories are representative of tradable uncertainty in the two regions.

invented entities (1)

settlement legibility (no independent evidence)
purpose: Coded measure of the degree to which an uncertainty can be worded, sourced, and credibly resolved by third parties
New construct introduced to explain selective contract formation

Reference graph

Works this paper leans on

40 extracted references · 19 canonical work pages

[1]

4.2 Sports versus non-sports The sports/non-sports split gives the cleanest descriptive contrast

c activity is sports-led, while Latin America-topic activity is non-sports-led, with Venezuela and elections carrying much of the observed value. 4.2 Sports versus non-sports The sports/non-sports split gives the cleanest descriptive contrast. It separates a highly standardized class of contracts from the civic and economic categories that motivate the fo...

2024
[2]

In the dated portion of the sample, African non-sports observed value grows from $0.26 million in 2023 to $7.94 million in

The 777 analytical-sample contracts without creation timestamps do not generate the result, since they carry $12.6 million in Africa observed value and $112.3 million in Latin America observed value. In the dated portion of the sample, African non-sports observed value grows from $0.26 million in 2023 to $7.94 million in

2023
[3]

Africa sports, Latin America politics

The African civic pattern is therefore thin but rising, not static absence. 4.3 Africa and AFCON The Africa sports result is concentrated in a specific repeated tournament class. AFCON matters analytically because it offers the combination of named teams, scheduled fixtures, official scores, and repeatable contract templates that lowers listing and settle...

2023
[4]

who will win the next election in country c?

D3 (Closure precision). Could a trader mark the resolution moment on a calendar in advance? D3 is scored from a decision tree that distinguishes scheduled moments, bounded windows, and open-ended triggers, with an explicit branch for institutions whose calendars exist on paper but not in practice. Our primary legibility score is the sum of the first two d...

1998
[5]

sports" or

The gradient documented below is this logic applied 451 times under blind conditions, not a judgment about which events matter. 5.2 Coding protocol and reliability The scientific content of the instrument lies in the codebook, and the codebook is authored, frozen, and human-owned. We wrote the construct definition, the anchors for each dimension, and the ...

2023
[6]

The conventional floor for drawing tentative conclusions is α ≥ 0.67; we adopted this floor in advance for every reliability claim we make

Human benchmark, stage two Fresh 24-unit blind human revalidation of D3 D3 passes We measure reliability with Krippendorff's alpha for ordinal data, α = 1 − D_o / D_e, (2) where D_o is the observed mean squared ordinal distance between paired codes and D_e is the distance expected under chance assignment from the empirical marginal distribution (Krippendo...

2004
[7]

A test of formation requires a denominator of events that could have become markets but did not necessarily do so

A formation test against an external event frame The contract inventory alone cannot identify formation because it contains only successful listings. A test of formation requires a denominator of events that could have become markets but did not necessarily do so. We therefore construct an external frame of salient civic events in Africa and Latin America...

2022
[8]

Elections and referenda enter from the IFES ElectionGuide calendar

It is built from sources external to the platforms and frozen, with written selection rules, before matching to the contract data. Elections and referenda enter from the IFES ElectionGuide calendar. Every national election and national referendum in the 85-country universe with an event day inside the window is included, aggregated to the country-day occa...

2022
[9]

= Λ( β₀ + β₁ L_e + β₂ ln S_e + β₃ LatAm_e ), (5) where F_e indicates that event e matched at least one contract, L_e is the two-dimension legibility score, S_e is the salience proxy in (3), LatAm_e indicates a Latin America event, and Λ is the logistic function. We pre-specified a Fisher exact test on a high-low legibility split as the fallback under sepa...

2023
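
The specification in equation (5) can be illustrated with a short Python sketch. The left-hand side is cut off in the excerpt; the sketch assumes it is the probability that event e matches at least one contract, Pr(F_e = 1). The coefficient values, legibility scores, and salience figure below are hypothetical placeholders chosen for illustration, not estimates or data from the paper.

```python
import math


def logistic(z):
    """Numerically stable logistic function, Lambda(z) = 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def listing_probability(legibility, salience, latam, beta):
    """Assumed reading of equation (5):
    Pr(F_e = 1) = Lambda(b0 + b1 * L_e + b2 * ln(S_e) + b3 * LatAm_e).
    """
    if salience <= 0:
        raise ValueError("salience must be positive to take its logarithm")
    b0, b1, b2, b3 = beta
    z = b0 + b1 * legibility + b2 * math.log(salience) + b3 * (1 if latam else 0)
    return logistic(z)


# Hypothetical coefficients (b0, b1, b2, b3); NOT the paper's estimates.
BETA_HYPOTHETICAL = (-6.0, 0.5, 0.3, 0.4)

if __name__ == "__main__":
    # Two invented events with equal salience but different legibility scores.
    for score in (2, 6):
        p = listing_probability(score, salience=10_000, latam=False,
                                beta=BETA_HYPOTHETICAL)
        print(f"legibility={score}: Pr(listed)={p:.3f}")
```

With a positive b1, the fitted probability rises with legibility at fixed salience and region, which is the direction the formation test is described as checking.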
[10]

First, we do not observe trader location

Limitations Six limitations delimit the interpretation. First, we do not observe trader location. Topic geography cannot be interpreted as participant geography or local demand. Second, observed trade-derived value is incomplete. It is based on collected trade records where available, not complete order-book liquidity. Missing trade-derived value is uneve...

2021
[11]

Source fetches run from November 2025 to February

2025
[12]

We study that formation layer for Africa-topic and Latin America-topic contracts on Polymarket and Kalshi

Conclusion Prediction markets reveal platform formation before they reveal trader beliefs. We study that formation layer for Africa-topic and Latin America-topic contracts on Polymarket and Kalshi. The audited sample shows that Africa is present but selectively visible, with observed value dominated by football and especially by AFCON, while Latin America...

2024
[13]

Two AI coders from different model families coded independently on blinded, shuffled inputs containing only the unit id, question or event description, and resolution source

D.2 Coding protocol and reliability The production coding covered 451 units: 320 contracts and 131 external-frame events. Two AI coders from different model families coded independently on blinded, shuffled inputs containing only the unit id, question or event description, and resolution source. Sector, salience, notional value, and outcome fields were st...

2004
[14]

Salience is complete for all 131 events and is measured from English Wikipedia country-page views

It combines 114 IFES ElectionGuide national election or referendum occasions with 17 stratum-B events: 12 coup attempts, 3 armed-conflict onsets under the UCDP episode-start rule, and 2 irregular leadership exits of chief executives. Salience is complete for all 131 events and is measured from English Wikipedia country-page views. The boundary rules were ...

2022
[15]

Since κ − a′ − b < κ − a − b and (ln Ḡ)′ is non-increasing, this derivative is non-positive: the ratio is non-increasing in b

∝ Ḡ(κ − a′ − b) / Ḡ(κ − a − b) has logarithmic derivative in b equal to (ln Ḡ)′(κ − a − b) − (ln Ḡ)′(κ − a′ − b). Since κ − a′ − b < κ − a − b and (ln Ḡ)′ is non-increasing, this derivative is non-positive: the ratio is non-increasing in b. The conditional law of b given a, F = 1 is therefore decreasing in a in the likelihood-ratio order, which implies fi...

1979
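
The selection argument in this excerpt can be illustrated by simulation. The Python sketch below assumes, for illustration only, that legibility a and latent value b are independent standard normal draws and that a unit is listed with probability Ḡ(κ − a − b), where Ḡ is the survival function of a logistic shock (a log-concave choice, so (ln Ḡ)′ is non-increasing). The distributions, the threshold κ = 1, and the sample size are assumptions, not values from the paper.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_selective_listing(n=20000, kappa=1.0, seed=0):
    """Return (corr over all units, corr among listed units, number listed)."""
    rng = random.Random(seed)
    all_a, all_b, listed_a, listed_b = [], [], [], []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)  # legibility, independent of b by construction
        b = rng.gauss(0.0, 1.0)  # latent trading value
        all_a.append(a)
        all_b.append(b)
        # Logistic survival function evaluated at kappa - a - b.
        p_list = 1.0 / (1.0 + math.exp(kappa - a - b))
        if rng.random() < p_list:
            listed_a.append(a)
            listed_b.append(b)
    return pearson(all_a, all_b), pearson(listed_a, listed_b), len(listed_a)


if __name__ == "__main__":
    corr_all, corr_listed, n_listed = simulate_selective_listing()
    print(f"all units: {corr_all:+.3f}; listed only ({n_listed}): {corr_listed:+.3f}")
```

Although a and b are unrelated in the full population, conditioning on listing induces a negative association between them, which is the pattern the abstract reports among listed contracts.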
[16]

Agreement that merely reflects both coders using a dominant category is thus credited as chance, not as reliability. This is the property that separates alpha from raw percent agreement: two coders who both assign the modal rank to every unit can reach high percent agreement while alpha is near zero, because D_e shrinks with the concentration of the margi...

2004
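
The chance correction described here can be checked numerically. The following Python sketch implements Krippendorff's alpha, α = 1 − D_o / D_e, with the ordinal distance metric for the simple case of two coders and no missing codes. The example codes are invented for illustration and are not data from the paper.

```python
import math
from collections import Counter
from itertools import product


def percent_agreement(codes_a, codes_b):
    """Share of units on which the two coders assign the same rank."""
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)


def ordinal_alpha(codes_a, codes_b):
    """Krippendorff's alpha (ordinal metric), two coders, no missing data."""
    if len(codes_a) != len(codes_b) or len(codes_a) == 0:
        raise ValueError("need two non-empty code lists of equal length")
    counts = Counter(list(codes_a) + list(codes_b))  # empirical marginals
    ranks = sorted(counts)
    n = sum(counts.values())  # total number of codes (2 per unit)

    def delta_sq(c, k):
        # Squared ordinal distance: codes lying between the two ranks,
        # counting half of each endpoint category.
        lo, hi = min(c, k), max(c, k)
        between = sum(counts[g] for g in ranks if lo <= g <= hi)
        return (between - (counts[lo] + counts[hi]) / 2) ** 2

    # Observed disagreement: each unit contributes the pairs (a, b) and (b, a).
    d_o = sum(2 * delta_sq(a, b) for a, b in zip(codes_a, codes_b)) / n
    # Expected disagreement under chance pairing from the marginals.
    d_e = sum(counts[c] * counts[k] * delta_sq(c, k)
              for c, k in product(ranks, repeat=2)) / (n * (n - 1))
    return 1 - d_o / d_e if d_e > 0 else math.nan


if __name__ == "__main__":
    # Invented example: both coders use rank 1 almost everywhere.
    coder_a = [1] * 9 + [2]
    coder_b = [1] * 8 + [2, 1]
    print(percent_agreement(coder_a, coder_b), ordinal_alpha(coder_a, coder_b))
```

In the invented example the coders agree on 8 of 10 units, yet alpha is close to zero because nearly all of that agreement is what the concentrated marginal distribution would produce by chance.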

[1] [1]

4.2 Sports versus non-sports The sports/non-sports split gives the cleanest descriptive contrast

c activity is sports-led, while Latin America-topic activity is non-sports-led, with Venezuela and elections carrying much of the observed value. 4.2 Sports versus non-sports The sports/non-sports split gives the cleanest descriptive contrast. It separates a highly standardized class of contracts from the civic and economic categories that motivate the fo...

2024

[2] [2]

In the dated portion of the sample, African non-sports observed value grows from $0.26 million in 2023 to $7.94 million in

The 777 analytical-sample contracts without creation timestamps do not generate the result, since they carry $12.6 million in Africa observed value and $112.3 million in Latin America observed value. In the dated portion of the sample, African non-sports observed value grows from $0.26 million in 2023 to $7.94 million in

2023

[3] [3]

Africa sports, Latin America politics

The African civic pattern is therefore thin but rising, not static absence. 4.3 Africa and AFCON The Africa sports result is concentrated in a specific repeated tournament class. AFCON matters analytically because it offers the combination of named teams, scheduled fixtures, official scores, and repeatable contract templates that lowers listing and settle...

2023

[4] [4]

who will win the next election in country c?

D3 (Closure precision). Could a trader mark the resolution moment on a calendar in advance? D3 is scored from a decision tree that distinguishes scheduled moments, bounded windows, and open-ended triggers, with an explicit branch for institutions whose calendars exist on paper but not in practice. Our primary legibility score is the sum of the first two d...

1998

[5] [5]

sports" or

The gradient documented below is this logic applied 451 times under blind conditions, not a judgment about which events matter. 5.2 Coding protocol and reliability The scientific content of the instrument lies in the codebook, and the codebook is authored, frozen, and human-owned. We wrote the construct definition, the anchors for each dimension, and the ...

2023

[6] [6]

The conventional floor for drawing tentative conclusions is α ≥ 0.67; we adopted this floor in advance for every reliability claim we make

Human benchmark, stage two: fresh 24-unit blind human revalidation of D3 (D3 passes). We measure reliability with Krippendorff's alpha for ordinal data,

α = 1 − D_o / D_e,    (2)

where D_o is the observed mean squared ordinal distance between paired codes and D_e is the distance expected under chance assignment from the empirical marginal distribution (Krippendo...

2004

[7] [7]

A test of formation requires a denominator of events that could have become markets but did not necessarily do so

A formation test against an external event frame The contract inventory alone cannot identify formation because it contains only successful listings. A test of formation requires a denominator of events that could have become markets but did not necessarily do so. We therefore construct an external frame of salient civic events in Africa and Latin America...

2022

[8] [8]

Elections and referenda enter from the IFES ElectionGuide calendar

It is built from sources external to the platforms and frozen, with written selection rules, before matching to the contract data. Elections and referenda enter from the IFES ElectionGuide calendar. Every national election and national referendum in the 85-country universe with an event day inside the window is included, aggregated to the country-day occa...

2022

[9] [9]

Pr(F_e = 1) = Λ(β₀ + β₁ L_e + β₂ ln S_e + β₃ LatAm_e),    (5)

where F_e indicates that event e matched at least one contract, L_e is the two-dimension legibility score, S_e is the salience proxy in (3), LatAm_e indicates a Latin America event, and Λ is the logistic function. We pre-specified a Fisher exact test on a high-low legibility split as the fallback under sepa...

2023
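The logit specification and its pre-specified fallback can be sketched in a few lines. This is a minimal illustration, not the paper's estimation code: it assumes equation (5) is read as a logit for the probability that an event is listed, the function names `listing_probability` and `fisher_exact_two_sided` are hypothetical, and any coefficients or cell counts passed in are placeholders, not estimates or data from the paper.

```python
from math import comb, exp, log


def listing_probability(b0, b1, b2, b3, legibility, salience, latam):
    """Logistic index in the form of equation (5); coefficients are placeholders."""
    index = b0 + b1 * legibility + b2 * log(salience) + b3 * latam
    return 1.0 / (1.0 + exp(-index))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: high vs. low legibility; columns: listed vs. not listed.
    The p-value sums the hypergeometric probabilities of every table
    (with the same margins) that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 listed events fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))
```

The exact test is a natural fallback because it needs only the four cell counts and stays well defined when a logit fit fails to converge under separation.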

[10] [10]

First, we do not observe trader location

Limitations Six limitations delimit the interpretation. First, we do not observe trader location. Topic geography cannot be interpreted as participant geography or local demand. Second, observed trade-derived value is incomplete. It is based on collected trade records where available, not complete order-book liquidity. Missing trade-derived value is uneve...

2021

[11] [11]

Source fetches run from November 2025 to February

2025

[12] [12]

We study that formation layer for Africa-topic and Latin America-topic contracts on Polymarket and Kalshi

Conclusion Prediction markets reveal platform formation before they reveal trader beliefs. We study that formation layer for Africa-topic and Latin America-topic contracts on Polymarket and Kalshi. The audited sample shows that Africa is present but selectively visible, with observed value dominated by football and especially by AFCON, while Latin America...

2024

[13] [13]

Two AI coders from different model families coded independently on blinded, shuffled inputs containing only the unit id, question or event description, and resolution source

D.2 Coding protocol and reliability The production coding covered 451 units: 320 contracts and 131 external-frame events. Two AI coders from different model families coded independently on blinded, shuffled inputs containing only the unit id, question or event description, and resolution source. Sector, salience, notional value, and outcome fields were st...

2004

[14] [14]

Salience is complete for all 131 events and is measured from English Wikipedia country-page views

It combines 114 IFES ElectionGuide national election or referendum occasions with 17 stratum-B events: 12 coup attempts, 3 armed-conflict onsets under the UCDP episode-start rule, and 2 irregular leadership exits of chief executives. Salience is complete for all 131 events and is measured from English Wikipedia country-page views. The boundary rules were ...

2022

[15] [15]

Since κ − a′ − b < κ − a − b and (ln Ḡ)′ is non-increasing, this derivative is non-positive: the ratio is non-increasing in b

∝ Ḡ(κ − a′ − b) / Ḡ(κ − a − b) has logarithmic derivative in b equal to (ln Ḡ)′(κ − a − b) − (ln Ḡ)′(κ − a′ − b). Since κ − a′ − b < κ − a − b and (ln Ḡ)′ is non-increasing, this derivative is non-positive: the ratio is non-increasing in b. The conditional law of b given a, F = 1 is therefore decreasing in a in the likelihood-ratio order, which implies fi...

1979
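The sign argument in this step can be written out explicitly. The following is a sketch of the calculation only; the symbol R(b) for the ratio and the shorthand x, x′ are introduced here for illustration, and a′ > a is inferred from the stated inequality κ − a′ − b < κ − a − b.

```latex
\[
R(b) \;=\; \frac{\bar G(\kappa - a' - b)}{\bar G(\kappa - a - b)},
\qquad a' > a .
\]
% Differentiate the log of the ratio with respect to b (chain rule, inner derivative -1):
\[
\frac{d}{db}\,\ln R(b)
\;=\; -(\ln \bar G)'(\kappa - a' - b) \;+\; (\ln \bar G)'(\kappa - a - b).
\]
% Let x' = \kappa - a' - b and x = \kappa - a - b, so that x' < x.
% Because (\ln \bar G)' is non-increasing, (\ln \bar G)'(x) \le (\ln \bar G)'(x'), hence
\[
\frac{d}{db}\,\ln R(b)
\;=\; (\ln \bar G)'(x) - (\ln \bar G)'(x') \;\le\; 0 ,
\]
% so R(b) is non-increasing in b.
```

The condition that (ln Ḡ)′ is non-increasing is the same as log-concavity of Ḡ, which is what delivers the monotone likelihood ratio.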

[16] [16]

Agreement that merely reflects both coders using a dominant category is thus credited as chance, not as reliability. This is the property that separates alpha from raw percent agreement: two coders who both assign the modal rank to every unit can reach high percent agreement while alpha is near zero, because D_e shrinks with the concentration of the margi...

2004
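The contrast between percent agreement and alpha described in this passage can be checked numerically. The following is a minimal sketch, not the paper's reliability code: it assumes two coders, no missing codes, and Krippendorff's standard ordinal distance metric; the function names and the toy ratings are illustrative assumptions.

```python
from collections import Counter
from itertools import product


def percent_agreement(codes_a, codes_b):
    """Share of units on which the two coders assign the same rank."""
    return sum(x == y for x, y in zip(codes_a, codes_b)) / len(codes_a)


def ordinal_alpha(codes_a, codes_b):
    """Krippendorff's alpha (ordinal metric) for two coders, no missing codes."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must rate the same units")
    ranks = sorted(set(codes_a) | set(codes_b))

    # Coincidence counts: each unit contributes both ordered pairs.
    o = Counter()
    for a, b in zip(codes_a, codes_b):
        o[(a, b)] += 1
        o[(b, a)] += 1
    n_c = {c: sum(o[(c, k)] for k in ranks) for c in ranks}  # marginals
    n = sum(n_c.values())                                    # = 2 * units

    def delta2(c, k):
        # Ordinal distance: mass between the two ranks, endpoints half-counted.
        lo, hi = min(c, k), max(c, k)
        if lo == hi:
            return 0.0
        between = sum(n_c[g] for g in ranks if lo <= g <= hi)
        return (between - (n_c[lo] + n_c[hi]) / 2) ** 2

    pairs = list(product(ranks, ranks))
    d_o = sum(o[(c, k)] * delta2(c, k) for c, k in pairs) / n
    d_e = sum(n_c[c] * n_c[k] * delta2(c, k) for c, k in pairs) / (n * (n - 1))
    return float("nan") if d_e == 0 else 1 - d_o / d_e


# Toy case: both coders use rank 2 almost everywhere and disagree on two units.
coder_a = [3] + [2] * 19
coder_b = [2, 3] + [2] * 18
print(percent_agreement(coder_a, coder_b))  # 0.9
print(ordinal_alpha(coder_a, coder_b))      # near zero (slightly negative)
```

In the toy case the coders agree on 18 of 20 units, yet all of that agreement sits in the dominant category, so the expected disagreement D_e is almost as small as the observed disagreement D_o and alpha lands near zero.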

[17] [17]

The Promise of Prediction Markets

"The Promise of Prediction Markets." Science 320(5878): 877-878. https://doi.org/10.1126/science.1157679.

work page doi:10.1126/science.1157679

[18] [18]

Prediction Market Accuracy in the Long Run

Berg, Joyce E., Forrest D. Nelson, and Thomas A. Rietz. "Prediction Market Accuracy in the Long Run." International Journal of Forecasting 24(2): 285-300. https://doi.org/10.1016/j.ijforecast.2008.03.007.

work page doi:10.1016/j.ijforecast.2008.03.007 2008

[19] [19]

Economization, Part 1: Shifting Attention from the Economy Towards Processes of Economization

"Economization, Part 1: Shifting Attention from the Economy Towards Processes of Economization." Economy and Society 38(3): 369-398. https://doi.org/10.1080/03085140903020580.

work page doi:10.1080/03085140903020580

[20] [20]

News Droughts, News Floods, and U.S. Disaster Relief

"News Droughts, News Floods, and U.S. Disaster Relief." Quarterly Journal of Economics 122(2): 693-728. https://doi.org/10.1162/qjec.122.2.693.

work page doi:10.1162/qjec.122.2.693

[21] [21]

Endogenous Selection Bias: The Problem of Conditioning on a Collider Variable

Elwert, Felix, and Christopher Winship. "Endogenous Selection Bias: The Problem of Conditioning on a Collider Variable." Annual Review of Sociology 40: 31-53. https://doi.org/10.1146/annurev-soc-071913-043455.

work page doi:10.1146/annurev-soc-071913-043455

[22] [22]

Semantic Non-Fungibility and Violations of the Law of One Price in Prediction Markets

Gebele, Jonas, and Florian Matthes. "Semantic Non-Fungibility and Violations of the Law of One Price in Prediction Markets." arXiv:2601.01706. https://arxiv.org/abs/2601.01706.


[23] [23]

ChatGPT Outperforms Crowd Workers for Text-Annotation Tasks

Gilardi, Fabrizio, Meysam Alizadeh, and Maël Kubli. "ChatGPT Outperforms Crowd Workers for Text-Annotation Tasks." Proceedings of the National Academy of Sciences 120(30): e2305016120. https://doi.org/10.1073/pnas.2305016120.

work page doi:10.1073/pnas.2305016120

[24] [24]

The Costs and Benefits of Ownership: A Theory of Vertical and Lateral Integration

Grossman, Sanford J., and Oliver D. Hart. "The Costs and Benefits of Ownership: A Theory of Vertical and Lateral Integration." Journal of Political Economy 94(4): 691-719. https://doi.org/10.1086/261404.

work page doi:10.1086/261404

[25] [25]

Incomplete Contracts and Control

"Incomplete Contracts and Control." American Economic Review 107(7): 1731-1752. https://doi.org/10.1257/aer.107.7.1731.

work page doi:10.1257/aer.107.7.1731

[26] [26]

Sample Selection Bias as a Specification Error

"Sample Selection Bias as a Specification Error." Econometrica 47(1): 153-161. https://doi.org/10.2307/1912352.

work page doi:10.2307/1912352

[27] [27]

Unlocking the Forecasting Economy: A Suite of Datasets for the Full Lifecycle of Prediction Market: [Experiments & Analysis]

Jia, Huaiyu, Luofeng Zhou, Wentao Zhang, Lin William Cong, Siguang Li, and Shuo Sun. "Unlocking the Forecasting Economy: A Suite of Datasets for the Full Lifecycle of Prediction Market: [Experiments & Analysis]." arXiv:2604.20421. https://arxiv.org/abs/2604.20421.


[28] [28]

Decomposing Crowd Wisdom: Domain-Specific Calibration Dynamics in Prediction Markets

"Decomposing Crowd Wisdom: Domain-Specific Calibration Dynamics in Prediction Markets." arXiv:2602.19520. https://arxiv.org/abs/2602.19520.


[29] [29]

Constructing a Market, Performing Theory: The Historical Sociology of a Financial Derivatives Exchange

MacKenzie, Donald, and Yuval Millo. "Constructing a Market, Performing Theory: The Historical Sociology of a Financial Derivatives Exchange." American Journal of Sociology 109(1): 107-145. https://doi.org/10.1086/374404.

work page doi:10.1086/374404

[30] [30]

Market Microstructure: A Survey

Madhavan, Ananth. "Market Microstructure: A Survey." Journal of Financial Markets 3(3): 205-258. https://doi.org/10.1016/S1386-4181(00)00007-0.

work page doi:10.1016/s1386-4181(00)00007-0

[31] [31]

Images of Protest: Dimensions of Selection Bias in Media Coverage of Washington Demonstrations, 1982 and 1991

McCarthy, John D., Clark McPhail, and Jackie Smith. "Images of Protest: Dimensions of Selection Bias in Media Coverage of Washington Demonstrations, 1982 and 1991." American Sociological Review 61(3): 478-499. https://doi.org/10.2307/2096360.

work page doi:10.2307/2096360 1982

[32] [32]

The Long History of Political Betting Markets: An International Perspective

"The Long History of Political Betting Markets: An International Perspective." In The Oxford Handbook of the Economics of Gambling, edited by Leighton Vaughan Williams and Donald S. Siegel, 559-586. Oxford: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199797912.013.0029.

work page doi:10.1093/oxfordhb/9780199797912.013.0029

[33] [33]

Prediction Laundering: The Illusion of Neutrality, Transparency, and Governance in Polymarket

Rohanifar, Yasaman, Syed Ishtiaque Ahmed, and Sharifa Sultana. "Prediction Laundering: The Illusion of Neutrality, Transparency, and Governance in Polymarket." arXiv:2602.05181. https://arxiv.org/abs/2602.05181.


[34] [34]

Repugnance as a Constraint on Markets

Roth, Alvin E. "Repugnance as a Constraint on Markets." Journal of Economic Perspectives 21(3): 37-58. https://doi.org/10.1257/jep.21.3.37.

work page doi:10.1257/jep.21.3.37

[35] [35]

Innovation, Competition, and New Contract Design in Futures Markets

"Innovation, Competition, and New Contract Design in Futures Markets." Journal of Futures Markets 1(2): 123-155. https://doi.org/10.1002/fut.3990010205.

work page doi:10.1002/fut.3990010205

[36] [36]

Prediction Markets for Economic Forecasting

Snowberg, Erik, Justin Wolfers, and Eric Zitzewitz. "Prediction Markets for Economic Forecasting." In Handbook of Economic Forecasting, vol. 2A, edited by Graham Elliott and Allan Timmermann, 657-687. Amsterdam: Elsevier. https://doi.org/10.1016/B978-0-444-53683-9.00011-6.

work page doi:10.1016/b978-0-444-53683-9.00011-6

[37] [37]

ChatGPT-4 Outperforms Experts and Crowd Workers in Annotating Political Twitter Messages with Zero-Shot Learning

Törnberg, Petter. "ChatGPT-4 Outperforms Experts and Crowd Workers in Annotating Political Twitter Messages with Zero-Shot Learning." arXiv:2304.06588. https://arxiv.org/abs/2304.06588.


[38] [38]

Prediction Markets

Wolfers, Justin, and Eric Zitzewitz. "Prediction Markets." Journal of Economic Perspectives 18(2): 107-126. https://doi.org/10.1257/0895330041371321.

work page doi:10.1257/0895330041371321

[39] [39]

Prediction Markets in Theory and Practice

Wolfers, Justin, and Eric Zitzewitz. "Prediction Markets in Theory and Practice." NBER Working Paper No. 12083. https://doi.org/10.3386/w12083.

work page doi:10.3386/w12083

[40] [40]

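Can Large Language Models Transform Computational Social Science?

Ziems, Caleb, William Held, Omar Shaikh, Jiaao Chen, Zhehao Zhang, and Diyi Yang. "Can Large Language Models Transform Computational Social Science?" Computational Linguistics 50(1): 237-291. https://doi.org/10.1162/coli_a_00502.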

work page doi:10.1162/coli_a_00502