Demographic Divides in Political Content Exposure on Facebook
Pith reviewed 2026-05-07 12:29 UTC · model grok-4.3
The pith
Political content forms only 18% of Facebook users' potential information diets, with large persistent differences across age, gender, and race.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors collect the complete lists of public pages and groups followed by over 1,100 users and examine the posts those accounts produced from 2012 to 2023. They establish that political content constitutes 18% of a user's potential information diet, which is otherwise dominated by lifestyle and entertainment topics. They also report significant and stable disparities in both the amount and the ideological direction of political content across age, gender, and racial groups, political content appearing inside non-political categories, and a sharp rise in the political share after the 2018 Meaningful Social Interactions update.
What carries the argument
The longitudinal dataset built from each user's full list of followed public pages and groups, used as a proxy for potential information exposure across hundreds of millions of posts.
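A minimal sketch of how a follow-based proxy like this could be turned into a per-user political share. The data layout, the names `follows`, `posts_by_account`, and `political_share`, and the toy values are all assumptions made for illustration; the paper's actual pipeline is not described here.

```python
from collections import Counter

def political_share(follows, posts_by_account):
    """Return {user: fraction of potential-diet posts labeled 'political'}.

    follows: dict user -> iterable of followed account ids (hypothetical layout)
    posts_by_account: dict account id -> list of topic labels, one per post
    """
    shares = {}
    for user, accounts in follows.items():
        counts = Counter()
        for account in accounts:
            # Every post from a followed account counts as potential exposure.
            counts.update(posts_by_account.get(account, []))
        total = sum(counts.values())
        # Users whose followed accounts produced no posts get None, not 0.
        shares[user] = counts["political"] / total if total else None
    return shares

# Toy data, invented for illustration only.
follows = {"u1": ["news", "recipes"], "u2": ["recipes"], "u3": []}
posts_by_account = {
    "news": ["political", "political", "entertainment"],
    "recipes": ["lifestyle", "lifestyle", "lifestyle", "political", "lifestyle"],
}
shares = political_share(follows, posts_by_account)
```

Pooling posts before dividing weights prolific accounts more heavily; averaging per-account shares would be a different, equally defensible choice.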
If this is right
- Assessments of Facebook's role in civic life must measure exposure directly rather than rely on engagement metrics alone.
- Demographic groups experience substantially different volumes and ideological leans of political content over long periods.
- Platform ranking changes can quickly increase or decrease the political portion of users' information diets.
- Political discourse circulates inside lifestyle and entertainment spaces, so boundaries between categories are porous.
- Longitudinal, user-follow-based data can reveal patterns that short-term or aggregate studies miss.
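The point about ranking changes rests on simple arithmetic: the abstract attributes the post-2018 rise to a contraction in the visibility of non-political posts, and a share can rise for that reason even if political volume is unchanged. A toy sketch with invented counts (not figures from the paper):

```python
def share(political, non_political):
    """Political fraction of all visible posts."""
    return political / (political + non_political)

# Invented counts: political volume is held fixed while
# non-political visibility is cut in half.
before = share(political=20, non_political=80)
after = share(political=20, non_political=40)

print(f"before: {before:.1%}, after: {after:.1%}")
```

The example shows why a rising share alone does not imply that more political posts were shown.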
Where Pith is reading between the lines
- The observed demographic gaps may help explain differences in political knowledge or attitudes across groups.
- If the public-follows measure is valid, then private groups and direct messages could add even more variation to actual exposure.
- The same followed-pages method could be applied to other platforms to test whether similar modest shares and divides appear elsewhere.
- Efforts to reduce political polarization on social media may need to address content in non-political spaces as well.
Load-bearing premise
The lists of public pages and groups followed by the users accurately represent their overall potential information environment without major bias from missing private groups or unlisted content.
What would settle it
A comparison using the same users' actual post impressions or engagement data that shows a political share far from 18% would indicate the followed-pages proxy does not capture real exposure.
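One way to reason about how far the overall figure could move is a simple mixture calculation. The sketch below assumes, hypothetically, that some fraction of real exposure comes from channels outside the followed set and that those channels have their own political share. Only the 18% figure comes from the paper; the function name and every scenario value are invented.

```python
def blended_share(observed_share, unobserved_fraction, unobserved_share):
    """Overall political share when part of exposure comes from unobserved channels."""
    if not 0.0 <= unobserved_fraction <= 1.0:
        raise ValueError("unobserved_fraction must be in [0, 1]")
    return ((1 - unobserved_fraction) * observed_share
            + unobserved_fraction * unobserved_share)

OBSERVED = 0.18  # political share reported for followed public pages and groups

# Hypothetical scenarios: (fraction of exposure unobserved, political share there).
scenarios = [(0.0, 0.18), (0.3, 0.05), (0.3, 0.40), (0.5, 0.40)]
for frac, unobs in scenarios:
    overall = blended_share(OBSERVED, frac, unobs)
    print(f"unobserved={frac:.0%}, political there={unobs:.0%} -> overall {overall:.1%}")
```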
Original abstract
Despite Facebook's central role in American civic life, a clear, evidence-based understanding of users' long-term information environments has remained elusive, hindering assessments of the platform's societal impact. This study addresses that gap by analyzing a unique decade-long dataset, constructed by collecting the full list of public pages and groups followed by over 1,100 American users. This approach allows us to examine the potential information exposure of these users by analyzing hundreds of millions of posts from 2012 to 2023. We find that political content constitutes a modest 18% of a user's potential information diet, which is predominantly composed of lifestyle and entertainment topics. This aggregate view, however, masks a deeply stratified reality: we uncover significant and persistent disparities in the volume and ideological leaning of political content across age, gender, and racial lines. Furthermore, we quantify the porous boundaries between content categories, showing how political discourse frequently permeates non-political spaces. Leveraging the dataset's longitudinal nature, we also assess the impact of major platform interventions. We find that Meta's 2018 "Meaningful Social Interactions" update dramatically increased the share of political content by contracting the visibility of non-political posts. By providing a granular, decade-long map of potential information exposure, our study offers one of the first representative and longitudinal picture drawn from platform-independent data. Our findings underscore the critical need for researchers to measure exposure, not merely engagement, and to account for the significant volume of political content that circulates in non-political spaces.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper analyzes a decade-long corpus of posts (2012–2023) from public pages and groups followed by over 1,100 U.S. Facebook users to characterize potential information diets. It reports that political content comprises 18% of this diet (with the remainder dominated by lifestyle and entertainment), documents persistent demographic disparities in political volume and ideological slant by age, gender, and race, shows political discourse leaking into non-political categories, and finds that Meta’s 2018 Meaningful Social Interactions update increased the political share by reducing non-political visibility.
Significance. If the measurement pipeline is sound, the work supplies one of the few long-horizon, platform-independent maps of potential exposure rather than engagement. The longitudinal span, the quantification of cross-category leakage, and the before/after assessment of a major platform change are genuine strengths that could inform both academic and policy discussions on information environments.
major comments (3)
- §3 (Data and Methods): the manuscript provides no description of the classifier or labeling procedure used to designate posts as political versus lifestyle/entertainment. Without precision, recall, or inter-annotator details for this step, the headline 18% figure and all subsequent demographic contrasts rest on an opaque measurement whose error structure is unknown.
- §3 and §5: the central claim that the collected public-page/group list constitutes a representative proxy for each user’s “potential information diet” is asserted without external validation or sensitivity checks. The paper does not quantify the share of actual exposure that occurs via private groups, friend reshares, or algorithmic recommendations outside the followed set; if political material is over-represented in those channels, both the aggregate 18% and the reported age/gender/race gaps could be systematically misestimated.
- §4.2 (Demographic stratification): the text does not report how self-reported or inferred demographic variables were coded, cleaned, or controlled for confounding (e.g., differential page-following rates by age). The absence of these operational details makes it impossible to assess whether the observed disparities are robust to alternative codings or selection corrections.
minor comments (2)
- Abstract: the phrase “one of the first representative and longitudinal picture” is grammatically awkward; consider rephrasing for clarity.
- Discussion: the paper would benefit from an explicit comparison table placing the 18% political share against prior estimates from engagement-based or survey-based studies.
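To illustrate why the first major comment matters, the sketch below shows how classifier error can shift a headline share. It computes precision and recall from a hypothetical hand-labeled validation sample and then applies the Rogan-Gladen correction, a standard prevalence adjustment used here purely for illustration; neither the paper nor the report names it. All counts are invented and were chosen so that the observed share equals 18%.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall for the 'political' class."""
    return tp / (tp + fp), tp / (tp + fn)

def corrected_prevalence(observed, sensitivity, specificity):
    """Rogan-Gladen correction of an observed positive rate for classifier error."""
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("classifier must perform better than chance")
    estimate = (observed + specificity - 1) / denom
    return min(1.0, max(0.0, estimate))  # clamp to a valid proportion

# Hypothetical validation sample of 1,000 hand-labeled posts (invented counts).
tp, fp, fn, tn = 150, 30, 50, 770
precision, recall = precision_recall(tp, fp, fn)
specificity = tn / (tn + fp)
observed = (tp + fp) / (tp + fp + fn + tn)   # share the classifier reports
true_share = corrected_prevalence(observed, recall, specificity)

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"observed={observed:.1%} corrected={true_share:.1%}")
```

In this invented sample the classifier misses more political posts than it falsely flags, so the observed share understates the hand-labeled one; with a different error profile the bias could run the other way.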
Simulated Author's Rebuttal
We thank the referee for their constructive comments. We address each major comment below and indicate the revisions planned for the next version of the manuscript.
- Referee: §3 (Data and Methods): the manuscript provides no description of the classifier or labeling procedure used to designate posts as political versus lifestyle/entertainment. Without precision, recall, or inter-annotator details for this step, the headline 18% figure and all subsequent demographic contrasts rest on an opaque measurement whose error structure is unknown.
  Authors: We agree that the current manuscript lacks sufficient detail on the classification procedure. In the revised version we will expand Section 3 to describe the classifier, the labeling protocol, precision and recall metrics, and inter-annotator agreement statistics so that the reliability of the 18% figure and downstream contrasts can be properly evaluated.
  Revision: yes
- Referee: §3 and §5: the central claim that the collected public-page/group list constitutes a representative proxy for each user’s “potential information diet” is asserted without external validation or sensitivity checks. The paper does not quantify the share of actual exposure that occurs via private groups, friend reshares, or algorithmic recommendations outside the followed set; if political material is over-represented in those channels, both the aggregate 18% and the reported age/gender/race gaps could be systematically misestimated.
  Authors: Our data consist exclusively of posts from the public pages and groups followed by the panelists, which we treat as a proxy for potential exposure through those channels. We cannot quantify the share of exposure occurring via private groups, friend reshares, or recommendations outside the followed set because those data are not available to us. In the revision we will add an explicit discussion of this scope limitation together with any sensitivity checks that can be performed with the existing data.
  Revision: partial
- Referee: §4.2 (Demographic stratification): the text does not report how self-reported or inferred demographic variables were coded, cleaned, or controlled for confounding (e.g., differential page-following rates by age). The absence of these operational details makes it impossible to assess whether the observed disparities are robust to alternative codings or selection corrections.
  Authors: We will revise Section 4.2 to document the coding and cleaning procedures for all demographic variables and will add controls for potential confounders such as differential page-following rates by age. Robustness checks under alternative codings and specifications will also be reported.
  Revision: yes
- Not resolved by the planned revision: quantifying the share of actual exposure from private groups, friend reshares, or algorithmic recommendations outside the followed set, because the dataset contains only public followed pages and groups.
Circularity Check
No circularity: purely observational data analysis
The paper performs direct empirical computation on a collected dataset of followed public pages/groups and their posts. Quantities such as the 18% political share and demographic disparities are simple aggregates and stratifications of the observed post corpus; no equations, fitted parameters, or predictions are defined in terms of themselves. No self-citation chains, ansatzes, or uniqueness theorems are invoked to derive the central results. The study is self-contained against its own data collection protocol.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: the sample of over 1,100 users is sufficiently representative of American Facebook users for generalizing demographic patterns.
- Domain assumption: posts from followed public pages and groups constitute the relevant potential information diet.