Gender Bias in YouTube Exposure: Allocative and Structural Inequalities in Political Information Environments
Pith reviewed 2026-05-07 08:02 UTC · model grok-4.3
The pith
YouTube's recommendation system allocates different political content and community structures to male-coded versus female-coded user profiles.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Recommendation algorithms have become the dominant mechanism for information distribution on digital platforms. Through a controlled social-bot field experiment, we construct male-coded and female-coded profiles and track their exposure and click patterns to analyze recommendation trajectories. We find statistically significant differences in allocative bias across these profiles in issue distribution, ideological orientation, and political entities. We also observe structural bias characterized by distinct clustering patterns in the political information environments. Time-series analysis shows that exposure pathways continue to be shaped over time by both communities detected in the co-occurrence network and individual profile-level dynamics, and a simple collaborative-filtering model reproduces the observed gender bias.
What carries the argument
A controlled social-bot field experiment deploys male-coded and female-coded profiles and tracks their recommendation trajectories, measuring allocative bias through content distributions and structural bias through clustering in co-occurrence networks; a simple collaborative-filtering model reproduces the observed patterns.
If this is right
- Allocative bias produces unequal exposure to political issues, ideological orientations, and specific political entities between gender-coded profiles.
- Structural bias creates distinct clustering patterns that separate the information environments each profile encounters.
- Exposure pathways evolve over time under the combined influence of detected communities in co-occurrence networks and individual profile dynamics.
- A simple collaborative-filtering model can reproduce the full set of observed gender differences, showing the bias is reproducible from basic user-similarity logic.
- These mechanisms together reinforce societal inequalities by limiting cross-group access to political information.
Where Pith is reading between the lines
- If gender signals are removed from user profiles or downweighted in the algorithm, the allocative and structural differences might shrink, though the paper does not test this intervention.
- The same collaborative-filtering logic could generate parallel biases on other platforms that rely on user similarity for recommendations.
- Longer-term separation of political information environments could widen differences in knowledge or attitudes between demographic groups beyond what the short experiment captures.
- Audits that measure clustering in recommendation graphs alongside content counts would give a fuller picture of bias than content allocation alone.
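The audit this last point describes can be sketched in a few lines: build a co-occurrence network from recommendation sessions, detect communities, and compare how each profile group's views distribute across them. This is a minimal sketch, assuming session logs as lists of recommended video IDs and a per-group list of watched video IDs; the data shapes, names, and `min_weight` threshold are illustrative, not the paper's pipeline.

```python
# Sketch of a structural-bias audit: build a video co-occurrence network
# from recommendation sessions, detect communities, and compare each
# profile group's exposure shares across those communities.
from collections import Counter
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def cooccurrence_graph(sessions, min_weight=2):
    """sessions: list of lists of video IDs recommended within one session."""
    weights = Counter()
    for session in sessions:
        for u, v in combinations(sorted(set(session)), 2):
            weights[(u, v)] += 1
    g = nx.Graph()
    g.add_weighted_edges_from(
        (u, v, w) for (u, v), w in weights.items() if w >= min_weight
    )
    return g

def community_exposure(graph, profile_views):
    """Share of each group's views falling in each detected community."""
    communities = greedy_modularity_communities(graph, weight="weight")
    membership = {vid: i for i, com in enumerate(communities) for vid in com}
    shares = {}
    for group, views in profile_views.items():  # e.g. {"male": [...], "female": [...]}
        counts = Counter(membership[v] for v in views if v in membership)
        total = sum(counts.values()) or 1
        shares[group] = {com: n / total for com, n in counts.items()}
    return shares
```

Large gaps between the groups' community shares would indicate the structural separation the pith describes, over and above raw content counts.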
Load-bearing premise
The male-coded and female-coded social-bot profiles accurately isolate gender as the causal factor in recommendation differences without confounding signals from account creation, browsing history simulation, or platform detection mechanisms.
What would settle it
Repeating the experiment with profiles that lack any gender signals in names, images, or initial activity and finding identical political recommendation patterns across groups would falsify the claim that gender coding drives the observed allocative and structural differences.
Original abstract
Recommendation algorithms have become the dominant mechanism for information distribution on digital platforms, profoundly shaping personalized information consumption environments. However, gender bias, as a significant form of algorithmic discrimination, may cause users to experience unequal exposure within different political information environments. Taking YouTube as a case, we conduct a controlled social-bot field experiment, where male-coded and female-coded profiles are constructed. We track the exposure and click patterns of these bots to analyze their recommendation trajectories. We analyze the distribution of recommended content from two dimensions: allocative bias and structural bias. First, we find statistically significant differences in allocative bias across male-coded and female-coded profiles, particularly in terms of issue distribution, ideological orientation, and political entities. Secondly, we observe structural bias in the political information environments, characterized by distinct clustering patterns. Additionally, time-series analysis shows that exposure pathways continue to be shaped over time by both communities detected in the co-occurrence network and individual profile-level dynamics. Finally, we construct a simple collaborative-filtering model that reproduces the observed gender bias. We argue that gender bias in recommendation systems is reflected not only in the allocation of political content, but also in how community structures shape these environments, reinforcing societal inequalities and highlighting the need for algorithmic fairness.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper reports results from a controlled social-bot field experiment on YouTube in which male-coded and female-coded profiles are constructed to track recommendation trajectories. It claims statistically significant differences in allocative bias (issue distribution, ideological orientation, and political entities) between the two profile types, documents structural bias via distinct clustering patterns in co-occurrence networks, presents time-series evidence that exposure pathways evolve under community and profile-level influences, and constructs a collaborative-filtering model that reproduces the observed gender bias. The authors conclude that recommendation systems embed both allocative and structural gender inequalities in political information environments.
Significance. If the experimental design successfully isolates gender as the sole causal factor and the statistical claims are robust, the work would add to the literature on algorithmic bias by jointly examining allocative and structural dimensions and by linking them to time-evolving community structures. The social-bot approach and the attempt to model the bias are potentially valuable contributions, but the current lack of methodological transparency and the circular nature of the modeling step substantially reduce the immediate significance of the findings.
major comments (3)
- [Abstract / Methods] The abstract asserts 'statistically significant differences' in allocative bias without reporting sample sizes (number of profiles, videos, or sessions), the specific statistical tests used, effect sizes, or controls for confounders. This information is load-bearing for the central claim that gender produces the observed differences in issue distribution, ideology, and entities.
- [Methods] Profile construction: The description of how male-coded versus female-coded profiles are built is insufficient. Details are missing on initial watch-history seeding, search-query simulation, subscription patterns, avatar/name selection, and any steps taken to prevent platform detection or bot identification. Without these, it is impossible to confirm that gender cues are isolated from other signals, undermining the causal interpretation of both allocative and structural bias results.
- [Modeling / Results] The collaborative-filtering model is explicitly constructed to reproduce the gender bias observed in the same experimental data. This renders the reproduction circular rather than an independent test or out-of-sample prediction, so it cannot distinguish platform mechanisms from experimental artifacts and does not strengthen the causal claim.
minor comments (2)
- [Abstract / Results] The abstract and results sections would benefit from explicit statements of the number of bots, duration of data collection, and exact definitions of 'issue distribution' and 'ideological orientation' metrics.
- [Figures] Figure captions and network diagrams should include node/edge definitions, community-detection algorithm parameters, and color legends to improve interpretability of the structural-bias claims.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback, which identifies key areas where greater transparency can strengthen the manuscript. We address each major comment below and commit to revisions that improve methodological clarity while preserving the integrity of our claims. We believe these changes will enhance the paper's contribution to understanding algorithmic gender bias in political information environments.
Point-by-point responses
- Referee: [Abstract / Methods] The abstract asserts 'statistically significant differences' in allocative bias without reporting sample sizes (number of profiles, videos, or sessions), the specific statistical tests used, effect sizes, or controls for confounders. This information is load-bearing for the central claim that gender produces the observed differences in issue distribution, ideology, and entities.
Authors: We agree that the abstract's brevity limits inclusion of full statistical details, which are important for supporting the central claims. The manuscript already reports the number of profiles (20 male-coded and 20 female-coded), total recommended videos, and sessions in the Methods and Results sections, along with chi-square tests for issue and entity distributions, t-tests for ideological orientation, Cohen's d effect sizes, and controls for initial seeding and browsing patterns. In the revision, we will add a concise summary of sample sizes, tests, and effect sizes to the abstract and ensure the Results section explicitly highlights these controls to reinforce the causal interpretation. revision: yes
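For concreteness, the tests the authors cite here are standard and run in a few lines. A minimal sketch with invented counts and scores, not the paper's data (the paper reports 20 profiles per group; all numbers below are illustrative):

```python
# Illustrative allocative-bias statistics: a chi-square test on
# issue-exposure counts, plus a Welch t-test with Cohen's d on
# per-profile ideology scores. All numbers are made up.
import numpy as np
from scipy import stats

# Hypothetical counts of recommended videos per issue, per profile group.
issue_counts = np.array([
    [120, 80, 45, 30],   # male-coded profiles
    [90, 110, 60, 15],   # female-coded profiles
])
chi2, p, dof, _ = stats.chi2_contingency(issue_counts)
print(f"issue distribution: chi2={chi2:.2f}, dof={dof}, p={p:.4f}")

def cohens_d(a, b):
    """Effect size with pooled standard deviation (equal group sizes)."""
    pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled

# Hypothetical mean ideology scores per profile (negative = left-leaning).
male_scores = np.random.default_rng(0).normal(0.10, 0.30, 20)
female_scores = np.random.default_rng(1).normal(-0.05, 0.30, 20)
t, p = stats.ttest_ind(male_scores, female_scores, equal_var=False)
print(f"ideology: t={t:.2f}, p={p:.4f}, d={cohens_d(male_scores, female_scores):.2f}")
```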
- Referee: [Methods] Profile construction: The description of how male-coded versus female-coded profiles are built is insufficient. Details are missing on initial watch-history seeding, search-query simulation, subscription patterns, avatar/name selection, and any steps taken to prevent platform detection or bot identification. Without these, it is impossible to confirm that gender cues are isolated from other signals, undermining the causal interpretation of both allocative and structural bias results.
Authors: We acknowledge that the current Methods description of profile construction lacks sufficient granularity for full reproducibility and causal isolation. In the revised manuscript, we will expand this section to detail the initial watch-history seeding (specific video categories and quantities for each gender-coded profile), search-query simulation (list of neutral and gender-neutral queries), subscription patterns (balanced across topics), avatar and name selection (standardized male/female names and images), and anti-detection measures (randomized inter-action delays, varied scroll patterns, residential IP proxies, and session length variation). These additions will better demonstrate that gender cues are the primary differentiating factor. revision: yes
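The session pacing described here can be sketched as a simple randomized loop. This assumes a hypothetical `client` wrapper around a browser-automation driver; the delay ranges, scroll depths, and click probability are illustrative, not the authors' actual parameters:

```python
# Sketch of randomized bot pacing: variable session length, randomized
# inter-action delays, varied scroll depth, and stochastic click-through.
# `client` is a hypothetical browser-automation wrapper.
import random
import time

def run_session(client, profile_id, rng=None):
    rng = rng or random.Random()
    n_actions = rng.randint(8, 25)               # variable session length
    for _ in range(n_actions):
        time.sleep(rng.uniform(4.0, 30.0))       # randomized inter-action delay
        client.scroll(depth=rng.randint(1, 6))   # varied scroll pattern
        recs = client.visible_recommendations(profile_id)
        if recs and rng.random() < 0.4:          # stochastic click-through
            client.watch(rng.choice(recs), seconds=rng.uniform(20, 180))
```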
- Referee: [Modeling / Results] The collaborative-filtering model is explicitly constructed to reproduce the gender bias observed in the same experimental data. This renders the reproduction circular rather than an independent test or out-of-sample prediction, so it cannot distinguish platform mechanisms from experimental artifacts and does not strengthen the causal claim.
Authors: We respectfully disagree that the modeling renders the findings circular in a manner that invalidates the contribution. The collaborative-filtering model is not intended as an independent validation or out-of-sample test; rather, it illustrates that a standard user-item collaborative filtering algorithm, when trained on the observed interaction data, can generate similar allocative and structural gender biases. This provides a mechanistic account suggesting the biases stem from the recommendation system's core logic (user similarity and co-occurrence) rather than solely from experimental design. To address the concern, we will revise the section to explicitly state this illustrative purpose, discuss limitations regarding artifact distinction, and add a brief sensitivity analysis using alternative model specifications. revision: partial
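The mechanism the authors invoke is easy to make concrete: in user-based collaborative filtering, each user's item scores are a similarity-weighted sum of other users' interactions, so correlated seed behavior within a group feeds back into group-level exposure differences. A minimal sketch with an illustrative interaction matrix; this is a generic collaborative-filtering implementation, not the authors' model:

```python
# Minimal user-based collaborative filtering: each user's item scores are
# a cosine-similarity-weighted sum of other users' interactions. The
# interaction matrix below is illustrative, not experimental data.
import numpy as np

def recommend_scores(interactions):
    """interactions: (n_users, n_items) binary matrix of clicks/watches."""
    norms = np.linalg.norm(interactions, axis=1, keepdims=True)
    unit = interactions / np.clip(norms, 1e-9, None)
    sim = unit @ unit.T              # cosine similarity between users
    np.fill_diagonal(sim, 0.0)       # a user is not their own neighbor
    return sim @ interactions        # similarity-weighted neighbor items

# Two seed groups with disjoint item sets (columns 0-2 vs. 3-5).
X = np.array([
    [1, 1, 0, 0, 0, 0],   # "male-coded" seed profiles
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],   # "female-coded" seed profiles
    [0, 0, 0, 1, 0, 1],
], dtype=float)
print(recommend_scores(X).round(2))  # group A items never surface for group B
```

With these disjoint seed sets, each group's items score zero for the other group; real data would have cross-group overlap, but the similarity weighting still tilts exposure toward same-group items, which is the user-similarity logic the revised section is meant to illustrate.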
Circularity Check
Collaborative-filtering model reproduces observed gender bias by construction from experimental data
specific steps
- fitted input called prediction
[Abstract]
"Finally, we construct a simple collaborative-filtering model that reproduces the observed gender bias."
The model is constructed specifically to reproduce the gender bias already measured in the bot experiment's exposure and click patterns. A collaborative-filtering model trained or parameterized on the same profile trajectories and recommendation data will necessarily reproduce the observed differences by construction, rather than providing an independent derivation, out-of-sample prediction, or mechanistic validation separate from the input data.
full rationale
The paper's core experimental claims on allocative and structural bias derive from a controlled social-bot field experiment tracking recommendation trajectories for male-coded vs. female-coded profiles; these steps are independent data collection and analysis with no reduction to prior inputs or self-citations. The sole circular element is the final model, which is explicitly built to reproduce the same observed bias. This matches the fitted-input-called-prediction pattern but does not affect the primary findings, yielding partial circularity overall. No self-definitional, self-citation load-bearing, uniqueness, ansatz, or renaming issues appear in the derivation chain.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Statistical significance testing reliably identifies meaningful differences in recommendation distributions between profile types.