The Persistent Non-Response Bias in a Sample-Matched Poll for the 2024 U.S. Presidential Election

Jay Chooi

arxiv: 2606.12889 · v1 · pith:SJ5CSBUDnew · submitted 2026-06-11 · 📊 stat.AP

Pith reviewed 2026-06-27 05:27 UTC · model grok-4.3

keywords: non-response bias, data defect correlation, sample matching, 2024 presidential election, polling error, bias correction, turnout adjustment

The pith

Non-response bias against Trump voters persisted in 2024 sample-matched polls at levels similar to 2016, and a correction using only prior election data reduces RMSE from 0.13 to 0.05.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies the data defect correlation framework to a large 2024 survey and shows that bias against Trump voters remains after sample matching to population demographics. It also identifies positive response bias for Harris voters after adjusting for turnout. The author builds a pre-election correction that draws solely on defect correlations and turnout patterns from earlier elections. This estimator brings root mean square error down from 0.13 to 0.05, performing about as well as methods that require post-election information. The findings indicate that conventional sample matching alone does not remove the bias that has affected recent presidential polls.

Core claim

Reanalysis of the Cooperative Election Study shows non-response bias for Trump voters at ρ = -0.0030 in 2024, close to the -0.0045 value recorded in 2016, even after sample matching to the U.S. adult population. Positive response bias for Harris voters emerges after turnout adjustment. Errors scale with state population size, and effective sample sizes fall by more than 99 percent in the largest states. A pre-election bias correction estimator, informed only by historical defect correlations and turnout rates, lowers RMSE from 0.13 to 0.05, which is comparable to post-election weighting (RMSE 0.09).

What carries the argument

The data defect correlation, which measures the association between an individual's probability of responding to the survey and their vote choice, used to quantify and correct persistent non-response bias.
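As a rough illustration of how such a correlation can be backed out once the true result is known, the sketch below inverts Meng's (2018) error identity, error = ρ · sqrt((N − n)/n) · σ, for a binary vote indicator. The function name and the example numbers are hypothetical and are not taken from the paper.

```python
import math

def data_defect_correlation(poll_share: float, true_share: float,
                            n: int, N: int) -> float:
    """Back out rho from Meng's identity for a binary outcome.

    error = rho * sqrt((N - n) / n) * sigma, with sigma = sqrt(p * (1 - p)),
    where p is the true population share.
    """
    if not (0 < n < N):
        raise ValueError("need 0 < n < N")
    if not (0.0 < true_share < 1.0):
        raise ValueError("true_share must be strictly between 0 and 1")
    sigma = math.sqrt(true_share * (1.0 - true_share))
    error = poll_share - true_share
    return error / (math.sqrt((N - n) / n) * sigma)

# Hypothetical state: 1,000 respondents, 3,000,000 voters,
# poll shows 46% for a candidate who actually received 50%.
rho = data_defect_correlation(0.46, 0.50, n=1_000, N=3_000_000)
print(f"rho = {rho:.5f}")
```

A negative value means supporters of that candidate were less likely to end up in the sample; even a magnitude near 0.001 corresponds to a four-point miss in this made-up example.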

If this is right

Polling errors continue to grow with state population size unless the historical correction is applied.
Standard confidence intervals become increasingly unreliable as sample size increases because of the large reduction in effective sample size.
Pre-election adjustments based on past cycles can reach accuracy levels previously available only after election results are known.
Sample matching to demographics alone leaves measurable non-response bias intact across election cycles.
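The population-size scaling and the loss of effective sample size in the first two points both follow from Meng's (2018) error decomposition, on which the data defect framework is built. A compact statement is given below in that paper's notation, which may differ from the notation of the paper under review. Here $f = n/N$ is the sampling fraction, $G$ the vote indicator with population standard deviation $\sigma_G$, and $R$ the response indicator. The last line is an approximation that ignores a finite-population adjustment and uses a single realized $\rho$ in place of its expectation.

```latex
\begin{align*}
\bar{G}_n - \bar{G}_N &= \rho_{R,G}\,\sqrt{\frac{1-f}{f}}\;\sigma_G, \\
Z_{n,N} &= \frac{\bar{G}_n - \bar{G}_N}{\sqrt{\operatorname{Var}_{\mathrm{SRS}}(\bar{G}_n)}}
         = \sqrt{N-1}\;\rho_{R,G}, \\
n_{\mathrm{eff}} &\approx \frac{f}{1-f}\cdot\frac{1}{\rho_{R,G}^{2}}.
\end{align*}
```

For a fixed $\rho_{R,G}$ of order $10^{-3}$, the second line says the error measured in simple-random-sampling units grows like $\sqrt{N}$, and the third says that extra respondents raise $n_{\mathrm{eff}}$ only through $f/(1-f)$, which stays small when $N$ is in the millions.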

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could be tested prospectively on the next national election by applying the same historical parameters before votes are counted.
If defect correlations prove stable across cycles, survey designers may shift resources from larger matched samples toward bias modeling.
The same correction logic may apply to other surveys where response propensity correlates with the measured outcome, such as health or economic polls.

Load-bearing premise

The data defect correlation values and turnout adjustments observed in prior elections remain stable and transferable predictors for correcting 2024 sample-matched polls without additional 2024-specific fitting.

What would settle it

Re-running the proposed estimator on 2024 state-level outcomes and obtaining an RMSE materially above 0.05 or worse than the post-election weighting benchmark of 0.09.
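A check of this kind reduces to a root-mean-square-error comparison across states. The sketch below shows the arithmetic only. The three estimates and outcomes are made-up placeholders, not CES or FEC figures, and the abstract does not state in what unit the paper computes its RMSE.

```python
import math
from typing import Sequence

def rmse(estimates: Sequence[float], outcomes: Sequence[float]) -> float:
    """Root mean square error between per-state estimates and outcomes."""
    if len(estimates) != len(outcomes) or len(estimates) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - o) ** 2 for e, o in zip(estimates, outcomes)]
    return math.sqrt(sum(squared) / len(squared))

# Placeholder values for three hypothetical states.
actual = [0.52, 0.47, 0.58]
raw_poll = [0.45, 0.41, 0.50]
corrected = [0.51, 0.46, 0.55]

print("raw RMSE:      ", round(rmse(raw_poll, actual), 4))
print("corrected RMSE:", round(rmse(corrected, actual), 4))
```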

Figures

Figures reproduced from arXiv: 2606.12889 by Jay Chooi.

**Figure 1.** Poll estimate vs actual vote share for Harris and Trump for each state, using raw polls and turnout-adjusted …

**Figure 2.** Histogram of data defect correlation for Harris and Trump voters for each state in the 2024 US presidential …

**Figure 3.** Error in SRS-units $Z_{n,N}$ to the number of total votes in each state. (Body text captured with the caption: "… Wyoming) are within the confidence intervals. For Harris, we also found that some states with bigger samples exit the confidence intervals, though the transgression is not as severe and numerous as compared to Trump. 3.4 Computing the effective sample sizes. Using the methods in Section 2.4, we compute the effective sample size for each sta…")

**Figure 4.** Standardized error $Z_n$ to sample size for each state, using validated voters

**Figure 5.** Bias-corrected raw poll estimate of Trump’s vote share in 2024. There is no consistent underestimation of …

Original abstract

Donald Trump won the 2024 US Presidential Election despite polls predicting a Democratic lead, echoing the polling miss in 2016. Using the data defect correlation framework, we revisit the 60,000-respondent Cooperative Election Study and find that non-response bias for Trump voters persists on the same order of magnitude ($\rho=-0.0030$ vs $-0.0045$ in 2016) even under sample-matching to the US adult population. We additionally find evidence of positive response bias for Harris voters after adjusting for turnout. Consistent with findings in 2016, polling errors scale with state population size, and larger samples produce greater departures from conventional confidence intervals, with reductions of effective sample size exceeding 99% in the largest states. We propose a pre-election bias correction estimator informed by historical data defect correlations and turnout rates that decreases RMSE from 0.13 to 0.05 using only prior election data, comparable to post-election weighting (RMSE 0.09).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adds 2024-specific ρ measurements close to the 2016 values and tests a historical-data correction that claims a large RMSE drop, but the transferability of those parameters is the unproven hinge.

The new material here is the 2024 Cooperative Election Study numbers: non-response bias for Trump voters sits at ρ = -0.0030, not far from the 2016 value of -0.0045, and the authors report a positive bias for Harris voters once turnout is adjusted. They also confirm that polling error scales with state population size and that effective sample sizes collapse in large states. On top of that they put forward a pre-election estimator that uses only earlier elections' correlations and turnout rates, cutting RMSE from 0.13 to 0.05 on the 2024 data—roughly matching what post-election weighting achieves.

Those measurements and the scaling observation are straightforward extensions of the data-defect framework and worth having on record. The pre-election framing is the part that could matter for practice if it holds.

The soft spot is exactly the one the stress-test flags. The estimator deliberately withholds 2024 information, so its reported gain assumes the 2016-era ρ and turnout adjustments still apply. The paper notes that the 2024 ρ is similar, but similarity after the fact does not prove the correction would have worked in real time if the underlying response patterns had shifted. Without the full methods section it is also unclear how the estimator was constructed, what sample restrictions were used, or whether the 0.05 RMSE reflects truly out-of-sample performance or some post-hoc tuning. Those details matter because the claim is practical rather than theoretical.

This is for readers who follow polling methodology and election forecasting. The fresh 2024 data points and the concrete estimator give it enough weight for a serious referee, even if the transferability question will need direct answers.

Referee Report

2 major / 1 minor

Summary. The paper claims that non-response bias persists in sample-matched polls for the 2024 U.S. Presidential Election at a data defect correlation of ρ=-0.0030 (vs. -0.0045 in 2016) even after matching to the US adult population, with evidence of positive response bias for Harris voters after turnout adjustment. It further claims that polling errors scale with state population size and that a pre-election bias correction estimator, informed only by historical data defect correlations and turnout rates, reduces RMSE from 0.13 to 0.05 (comparable to post-election weighting at RMSE 0.09).

Significance. If the transferability assumption holds, the work offers a notable contribution by demonstrating persistent non-response bias in modern sample-matched polls and providing a concrete pre-election correction method that achieves substantial RMSE improvement using only prior-election data. The explicit reporting of metrics such as ρ values and RMSE reductions, along with the population-size scaling observation, strengthens the empirical grounding and has clear implications for survey methodology and election polling practice.

major comments (2)

[Proposed pre-election bias correction estimator] The central RMSE reduction claim (0.13 to 0.05) for the pre-election estimator is load-bearing on the assumption that 2016-era data defect correlations (≈−0.0045) and turnout adjustments transfer stably to 2024 without refitting; the manuscript provides no sensitivity analysis, cross-validation, or robustness check against plausible deviations in these historical parameters (e.g., due to mode shifts or candidate effects), leaving the out-of-sample performance unverified.
[Data defect correlation framework and 2024 CES analysis] The reported 2024 CES ρ=-0.0030 is presented as comparable to the historical value used in the estimator, but the text does not explicitly confirm that the estimator's application to 2024 polls withholds all 2024 information (including any indirect use via turnout rates or sample-matching details); this distinction is required to substantiate the pre-election and non-circular nature of the RMSE improvement.

minor comments (1)

[Abstract] The abstract states that 'larger samples produce greater departures from conventional confidence intervals' and 'reductions of effective sample size exceeding 99% in the largest states' without naming the states, providing the exact effective-sample-size formula, or showing the supporting table/figure; adding these details would aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments, which help clarify key aspects of our pre-election estimator. We address each major comment below and have revised the manuscript accordingly to improve transparency and robustness.

Referee: [Proposed pre-election bias correction estimator] The central RMSE reduction claim (0.13 to 0.05) for the pre-election estimator is load-bearing on the assumption that 2016-era data defect correlations (≈−0.0045) and turnout adjustments transfer stably to 2024 without refitting; the manuscript provides no sensitivity analysis, cross-validation, or robustness check against plausible deviations in these historical parameters (e.g., due to mode shifts or candidate effects), leaving the out-of-sample performance unverified.

Authors: We agree that a sensitivity analysis would strengthen the transferability claim. The manuscript relies on the empirical similarity between the 2016 ρ value and the independently observed 2024 CES ρ, but does not include formal robustness checks. We will add an appendix with sensitivity analyses that vary the input historical ρ by plausible ranges (e.g., ±20% around −0.0045) and recompute the corrected estimates and RMSE; preliminary checks indicate the RMSE reduction remains substantial across these ranges. This revision will directly address the out-of-sample verification concern. revision: yes
Referee: [Data defect correlation framework and 2024 CES analysis] The reported 2024 CES ρ=-0.0030 is presented as comparable to the historical value used in the estimator, but the text does not explicitly confirm that the estimator's application to 2024 polls withholds all 2024 information (including any indirect use via turnout rates or sample-matching details); this distinction is required to substantiate the pre-election and non-circular nature of the RMSE improvement.

Authors: We appreciate the referee's emphasis on explicit separation of information. The estimator is constructed solely from 2016 data defect correlations and pre-2024 turnout rates; no 2024 CES data, sample-matching weights, or 2024 turnout information enters the correction applied to the 2024 polls. The 2024 CES ρ is reported only as a post-hoc validation of bias persistence and plays no role in the estimator. We will revise the methods and results sections to state this information partition explicitly, including a sentence confirming that the RMSE calculation uses only historical inputs. revision: yes
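
The sensitivity analysis promised in the first response could look roughly like the following. This is a sketch under stated assumptions, not the author's estimator: it assumes the correction subtracts the bias implied by Meng's identity, plugs the poll share into the standard deviation term because the true share is unknown before the election, and sweeps the historical ρ over ±20% of −0.0045. The function names, the state inputs and the use of a single national ρ are all hypothetical, and the placeholder numbers were chosen only so the loop runs; they say nothing about the paper's results.

```python
import math

def corrected_share(poll_share: float, rho_hist: float, n: int, N: int) -> float:
    """Subtract the bias implied by a historical rho (plug-in sigma)."""
    sigma = math.sqrt(poll_share * (1.0 - poll_share))  # assumption: plug-in
    bias = rho_hist * math.sqrt((N - n) / n) * sigma
    return poll_share - bias

def rmse(estimates, outcomes):
    """Root mean square error over paired estimates and outcomes."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimates, outcomes))
                     / len(estimates))

# Hypothetical states: (poll share, respondents n, voters N, actual share).
states = [
    (0.44, 900, 2_300_000, 0.55),
    (0.40, 1_200, 3_400_000, 0.52),
    (0.47, 300, 600_000, 0.53),
]
actual = [s[3] for s in states]
baseline = rmse([s[0] for s in states], actual)

# Sweep the historical rho over +/-20% and record the corrected RMSE.
results = {}
for scale in (0.8, 0.9, 1.0, 1.1, 1.2):
    rho = -0.0045 * scale
    est = [corrected_share(p, rho, n, N) for p, n, N, _ in states]
    results[scale] = rmse(est, actual)
    print(f"rho = {rho:.5f}  RMSE = {results[scale]:.4f}  (raw {baseline:.4f})")
```

Reporting the whole sweep rather than a single corrected RMSE shows how quickly the gain erodes when the historical ρ is misspecified, which is the referee's concern.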

Circularity Check

0 steps flagged

No circularity: out-of-sample transfer of historical parameters

The paper's central estimator is constructed exclusively from prior-election data defect correlations and turnout rates, then applied to 2024 sample-matched polls; the reported RMSE reduction (0.13 to 0.05) is computed against actual 2024 election outcomes, which are external to the parameter estimation step. No equations or claims reduce the 2024 correction to a fit that includes 2024 data, and no self-citation or uniqueness theorem is invoked. The stability assumption is stated as an assumption rather than derived, making the evaluation a genuine out-of-sample test rather than a tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based solely on the abstract, the central claim rests on the transferability of historical correlations; no explicit free parameters beyond the reported measured ρ values are identified, and no new entities are postulated.

axioms (1)

domain assumption: The data defect correlation framework applies directly to sample-matched polls and can be used to quantify non-response bias.
The paper invokes this framework to interpret the 2024 data and construct the correction.

Appendix excerpts captured by the reference extractor

Appendix A, Data Availability Statement: All data used in this study are publicly available from third-party sources:
• Cooperative Election Study (CES) 2024 Common Content. The 60,000-respondent survey analyzed throughout this pape...

Table D.1: Sample size, total votes, effective sample size and percentage reduction of samples in each state for the 2024 US presidential election

| State | Sample size | Total votes | Effective sample size | Percentage reduction |
| --- | --- | --- | --- | --- |
| Alabama | 882 | 2,256,352 | 20 | 97.71% |
| Alaska | 117 | 338,177 | 18 | 84.72% |
| Arizona | 1,162 | 3,389,319 | ... | ... |
