pith. machine review for the scientific record.

arxiv: 2605.06999 · v1 · submitted 2026-05-07 · 💻 cs.SI

Recognition: no theorem link

TubeCensus: A Transparent, Replicable, and Large-Scale Census of YouTube Channels and their Subscriber Counts Over Time

Pith reviewed 2026-05-11 01:14 UTC · model grok-4.3

classification 💻 cs.SI
keywords YouTube, Internet Archive, longitudinal dataset, creator economy, subscriber counts, social media census, replicable data, channel growth

The pith

TubeCensus builds a historical record of YouTube channels and subscriber counts by linking two decades of archived page captures.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs TubeCensus to give researchers a transparent, time-stamped view of YouTube creators and their audience sizes without relying on the platform's official API. This matters because the API restricts full access to creators and metadata, leaving gaps in understanding how the creator economy works and how platform changes affect incentives. By harvesting and connecting public Internet Archive records, the dataset covers creators behind at least 30 to 36 percent of all YouTube content and performs well for prominent channels. The authors package the results in a simple pip tool so others can use the data directly for studies of channel growth and content patterns.

Core claim

TubeCensus organizes nearly twenty years of YouTube page captures from the Internet Archive into a longitudinal dataset of channels and subscriber counts. The construction is transparent and replicable, requires no interaction with the official YouTube API, and covers creators responsible for at least 30-36 percent of platform content, with good coverage of prominent creators.

What carries the argument

The collection, linking, and organization of Internet Archive captures of YouTube pages into historical channel and subscriber records.
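
As a rough illustration of the kind of capture listing this machinery starts from, the sketch below builds a query for the Internet Archive's public Wayback CDX endpoint and parses its JSON output into timestamped rows. It is not the authors' pipeline: the function names, the example channel URL, and the sample response are assumptions made for illustration, and no network request is issued.

```python
import json
from urllib.parse import urlencode

# Public Wayback Machine CDX endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(page_url, year_from=None, year_to=None):
    """Return a CDX query URL that lists captures of page_url as JSON."""
    params = {
        "url": page_url,
        "output": "json",
        "fl": "timestamp,original,statuscode",
    }
    if year_from is not None:
        params["from"] = str(year_from)
    if year_to is not None:
        params["to"] = str(year_to)
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_json(text):
    """Convert CDX JSON (a header row followed by data rows) into dicts."""
    rows = json.loads(text)
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


# Hypothetical response for a made-up channel URL (assumption, not real data).
SAMPLE_RESPONSE = json.dumps([
    ["timestamp", "original", "statuscode"],
    ["20120314093000", "http://www.youtube.com/user/example", "200"],
    ["20150702181500", "https://www.youtube.com/user/example", "200"],
])

query_url = build_cdx_query("youtube.com/user/example", 2006, 2016)
captures = parse_cdx_json(SAMPLE_RESPONSE)
```

Each parsed row carries a 14-digit capture timestamp, which is what would let captures of the same page be ordered in time before any subscriber count is extracted.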

If this is right

  • Researchers gain access to time-series subscriber data for studying how creator audiences evolve in response to platform algorithm updates.
  • The same public archive sources can be used by others to replicate or extend the census without depending on changing API outputs.
  • Initial analysis of channel content types and growth mechanisms becomes possible at a scale that covers a meaningful portion of the platform.
  • The pip package allows direct use of the cleaned dataset while hiding the details of YouTube identifiers and capture linking.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same archive-linking method could be adapted to build comparable historical datasets for other platforms that have been regularly captured by web archives.
  • Combining TubeCensus subscriber histories with separate video metadata could support tests of whether specific content formats drive sustained audience growth.
  • Periodic updates to the dataset would let researchers track ongoing changes in the creator landscape as new channels appear and older ones evolve.

Load-bearing premise

The Internet Archive captures of YouTube pages are complete enough and can be accurately linked to specific channels across different times without major gaps or identifier errors.

What would settle it

Finding that a substantial share of high-view or high-subscriber channels identified in independent sources are missing from TubeCensus or show mismatched subscriber histories would challenge the coverage and accuracy claims.
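
A check of this kind can be sketched as a set comparison between an independent list of prominent channels and the channels present in the census. The function name and the channel identifiers below are hypothetical; how the paper itself performs its validation is not described in the text reviewed here.

```python
def coverage_report(reference_ids, census_ids):
    """Return (share of reference channels found, sorted list of missing IDs)."""
    reference = set(reference_ids)
    found = reference & set(census_ids)
    missing = sorted(reference - found)
    share = len(found) / len(reference) if reference else 0.0
    return share, missing


# Hypothetical identifiers, for illustration only.
independent_top_channels = ["UC_A", "UC_B", "UC_C", "UC_D"]
census_channels = ["UC_A", "UC_B", "UC_D", "UC_Z"]

share, missing = coverage_report(independent_top_channels, census_channels)
```

A low share, or a missing list dominated by very large channels, would be the kind of evidence that challenges the coverage claim.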

Figures

Figures reproduced from arXiv: 2605.06999 by Abram Handler, Chloe Eggleston, Maria Leonor Pacheco.

Figure 1: The API for the TUBE-CENSUS package.
Figure 2: A schema diagram illustrating the data resources …
Figure 3: Number of YouTube channel page captures on the Wayback Machine for each URL format per year.
Figure 4: Capture count rank-frequency plot of a random …
Figure 5: The clusters correspond to distinct content gen…
Figure 6: Examples of typical S-curve growth of YouTube …
Figure 7: Examples of second channels overlaid under each …
Figure 8: Examples of anomalous growth associated with …
Figure 9: Channel growth (subscribers) over time (from 2006 through 2016) for top 75 most subscribed channels through 2013.

Abstract

YouTube is central to contemporary mass media. However, the official YouTube API does not provide access to the full set of creators or creator metadata on the platform. This lack of basic visibility into the YouTube ecosystem hinders understanding of the platform's creator economy. Researchers currently have no easy, transparent, or replicable way to construct large-scale datasets of YouTube creators and their audiences over time. This makes it challenging to study vital social questions, such as how changes to the YouTube recommendation algorithm shape creator incentives and by extension the mass media on the platform. We address this gap with TubeCensus, a large-scale longitudinal dataset of YouTube creators and subscriber counts, constructed by collecting, linking, and organizing nearly two decades of YouTube page captures from the Internet Archive. This approach is transparent and replicable and does not require interaction with the YouTube API, whose output can change over time. We validate the coverage of TubeCensus against prior estimates of YouTube's size and find that our resource includes creators responsible for at least 30-36% of all YouTube content. We also find that TubeCensus provides good coverage of prominent creators. To support future research, we hide the substantial complexities of the YouTube identifier system and Internet Archive capture system by distributing our dataset via an easy-to-use pip package. Finally, we use our resource to complete basic exploratory analysis of YouTube channel content and the mechanisms associated with YouTube channel growth.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The manuscript introduces TubeCensus, a longitudinal dataset of YouTube channels and subscriber counts constructed by collecting, linking, and organizing nearly two decades of YouTube page captures from the Internet Archive. It claims to cover creators responsible for at least 30-36% of all YouTube content with good coverage of prominent creators, distributes the resource via an easy-to-use pip package that abstracts away identifier and capture complexities, and includes basic exploratory analysis of channel content and growth mechanisms.

Significance. If the coverage and linking claims hold, TubeCensus would be a valuable public resource for social science research on the YouTube creator economy, enabling replicable studies of platform dynamics without dependence on the official API. The explicit strengths are the transparent, API-free construction from public captures, the pip package for accessibility, and the focus on longitudinal subscriber data; these directly address the stated gap in visibility into creator incentives and mass media on the platform.

major comments (1)
  1. [Abstract] Abstract and validation description: the central 30-36% coverage claim (and the 'good coverage of prominent creators' statement) is load-bearing for the paper's contribution, yet the provided text supplies no methods, prior size estimates referenced, matching procedure for channel identifiers across snapshots, precision/recall, or sensitivity analysis for IA incompleteness and ID changes (usernames to UC... IDs). This prevents assessment of whether the percentage is robust or affected by systematic gaps.
minor comments (3)
  1. The manuscript should include a dedicated methods section (or subsection) detailing the linking algorithm, deduplication rules, and any exclusion criteria for captures or channels to support replicability claims.
  2. Clarify in the exploratory analysis section how subscriber counts are aggregated or interpolated across irregular IA snapshot dates, and whether any temporal alignment or normalization is applied.
  3. The pip package description would benefit from a short usage example or API reference in the main text or appendix to demonstrate how complexities are hidden for end users.
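
To make the second minor comment concrete, one common way to align irregular snapshot dates is linear interpolation between the two nearest captures. The sketch below shows that option only; whether the authors interpolate at all is exactly what the comment asks them to clarify, and the function name and sample values are assumptions.

```python
from bisect import bisect_left
from datetime import date


def interpolate_subscribers(snapshots, when):
    """Linearly interpolate a subscriber count at date `when`.

    snapshots: list of (date, count) pairs sorted by date.
    Returns None when `when` lies outside the observed range, so the
    function never extrapolates beyond the first or last capture.
    """
    if not snapshots:
        return None
    days = [d.toordinal() for d, _ in snapshots]
    t = when.toordinal()
    if t < days[0] or t > days[-1]:
        return None
    i = bisect_left(days, t)
    if days[i] == t:
        return float(snapshots[i][1])
    t0, y0 = days[i - 1], snapshots[i - 1][1]
    t1, y1 = days[i], snapshots[i][1]
    return y0 + (y1 - y0) * (t - t0) / (t1 - t0)


# Hypothetical capture history for one channel.
history = [(date(2012, 1, 1), 1000), (date(2012, 1, 11), 2000)]
midpoint = interpolate_subscribers(history, date(2012, 1, 6))  # 1500.0
```

Returning None outside the observed range keeps gaps in archive coverage visible instead of hiding them behind extrapolated values.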

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful review and for identifying the need for greater transparency around our coverage validation. We address the single major comment below and will incorporate the requested details into the revised manuscript.

  1. Referee: [Abstract] Abstract and validation description: the central 30-36% coverage claim (and the 'good coverage of prominent creators' statement) is load-bearing for the paper's contribution, yet the provided text supplies no methods, prior size estimates referenced, matching procedure for channel identifiers across snapshots, precision/recall, or sensitivity analysis for IA incompleteness and ID changes (usernames to UC... IDs). This prevents assessment of whether the percentage is robust or affected by systematic gaps.

    Authors: We agree that the abstract is too concise on validation and that the manuscript should make the supporting methods, estimates, and robustness checks explicit. The full paper contains a dedicated Validation section that compares TubeCensus to prior published estimates of total YouTube channels and content volume; we will add a one-sentence summary of those references and the resulting 30-36% range directly into the abstract. The channel-linking procedure resolves usernames to UC... IDs across snapshots and uses secondary metadata (titles, descriptions, and upload counts) to handle identifier changes; we will insert a brief description of this multi-identifier matching into both the abstract and the Validation section. We will also add (i) precision/recall figures obtained from manual annotation of a random sample of channels and (ii) a sensitivity analysis that varies the number of Internet Archive captures retained and reports the resulting coverage bounds. These additions will be included in the revised manuscript. revision: yes
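
The precision/recall figures promised in this response could be computed along the following lines, treating each username-to-channel-ID link as one prediction and a manually annotated sample as ground truth. This is a minimal sketch under that assumption: the function name, the pair representation, and the identifiers are hypothetical and do not come from the paper.

```python
def precision_recall(predicted_links, gold_links):
    """Compare predicted (username, channel_id) links with annotated ones."""
    predicted, gold = set(predicted_links), set(gold_links)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical links: legacy username path -> UC... channel ID.
predicted = [
    ("user/alpha", "UC_A"),
    ("user/beta", "UC_B"),
    ("user/gamma", "UC_X"),  # wrong link
]
annotated = [
    ("user/alpha", "UC_A"),
    ("user/beta", "UC_B"),
    ("user/gamma", "UC_C"),
    ("user/delta", "UC_D"),  # link the procedure missed
]

precision, recall = precision_recall(predicted, annotated)
```

Here two of three predicted links are correct and two of four annotated links are recovered, so precision is 2/3 and recall is 1/2.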

Circularity Check

0 steps flagged

No circularity: data aggregation from external captures with external validation

The paper describes construction of TubeCensus via collection, linking, and organization of Internet Archive YouTube page captures, followed by empirical validation of coverage against prior independent estimates of YouTube's size. No mathematical derivations, fitted parameters, predictions, or self-citations appear in the provided text that reduce any central claim to its own inputs by construction. The 30-36% coverage figure is presented as a direct measurement result rather than a self-referential output, and the methodology is framed as transparent and replicable using public external data. This is a standard data-resource paper with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on assumptions about archive completeness rather than new parameters or entities.

axioms (1)
  • domain assumption: Internet Archive captures provide sufficient coverage and accurate linking for YouTube channel identifiers and subscriber counts over time.
    Invoked in the validation and construction sections to justify the 30-36% coverage claim.

pith-pipeline@v0.9.0 · 5575 in / 1261 out tokens · 35433 ms · 2026-05-11T01:14:15.754064+00:00 · methodology
