Brazilian Social Media Anti-vaccine Information Disorder Dataset -- Telegram (2020-2025)
Pith reviewed 2026-05-16 10:43 UTC · model grok-4.3
The pith
This paper releases a dataset of roughly four million Telegram posts from Brazilian anti-vaccine channels collected between 2020 and 2025.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper introduces a curated dataset of about four million Telegram posts collected from 119 prominent Brazilian anti-vaccine channels between 2020 and 2025, including message content, metadata, associated media, and classification related to vaccine posts, to enable examination of how false or misleading information spreads and influences public sentiment.
What carries the argument
The dataset of Telegram posts with content, metadata, media, and vaccine classifications that serves as the resource for analyzing misinformation patterns.
Load-bearing premise
The selected 119 channels represent the main sources of anti-vaccine content on Brazilian Telegram and the collection avoids selection bias or privacy violations.
What would settle it
Discovery of a large volume of anti-vaccine posts from Brazilian Telegram channels outside the 119 included ones would show the dataset does not fully capture the landscape.
Figures
read the original abstract
Over the past decade, Brazil has experienced a decline in vaccination coverage, reversing decades of public health progress achieved through the National Immunization Program (PNI). Growing evidence points to the widespread circulation of vaccine-related misinformation -- particularly on social media platforms -- as a key factor driving this decline. Among these platforms, Telegram remains the only major platform permitting accessible and ethical data collection, offering insight into public channels where vaccine misinformation circulates extensively. This data paper introduces a curated dataset of about four million Telegram posts collected from 119 prominent Brazilian anti-vaccine channels between 2020 and 2025. The dataset includes message content, metadata, associated media, and classification related to vaccine posts, enabling researchers to examine how false or misleading information spreads, evolves, and influences public sentiment. By providing this resource, our aim is to support the scientific and public health community in developing evidence-based strategies to counter misinformation, promote trust in vaccination, and engage compassionately with individuals and communities affected by false narratives. The dataset and documentation are openly available for non-commercial research, under strict ethical and privacy guidelines at https://doi.org/10.25824/redu/5JIVDT
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a dataset of approximately four million Telegram posts collected from 119 prominent Brazilian anti-vaccine channels spanning 2020-2025. It includes message content, metadata, associated media, and vaccine-related classifications, with the resource released openly under ethical and privacy guidelines at a specified DOI to support research on misinformation spread and its effects on vaccination coverage.
Significance. If the collection is representative, the dataset would offer a substantial, timely resource for studying vaccine-related information disorder on Telegram in Brazil, where declining immunization rates have been linked to social media content. The scale, multi-year window, inclusion of media, and open ethical release under non-commercial terms strengthen its potential utility for public health and computational social science research.
major comments (1)
- [Data collection] Data collection section: The manuscript states that posts come from 119 'prominent' Brazilian anti-vaccine channels but provides no explicit inclusion criteria (e.g., subscriber thresholds, search terms, activity filters, manual verification, or overlap with known channel lists). Without these details, systematic selection bias cannot be assessed, directly weakening the central claim that the corpus enables representative study of anti-vaccine content circulation.
minor comments (2)
- [Abstract] Abstract and methods: The description of 'classification related to vaccine posts' lacks detail on the labeling process, inter-annotator agreement, or validation steps; adding a brief summary would improve reproducibility.
- [Dataset availability] Dataset documentation: Confirm that the released materials include channel metadata (e.g., subscriber counts at collection time) and a clear statement of any temporal or geographic coverage limitations.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback and recommendation for major revision. We agree that explicit documentation of channel selection is necessary to assess potential biases and will revise the manuscript accordingly.
read point-by-point responses
-
Referee: The manuscript states that posts come from 119 'prominent' Brazilian anti-vaccine channels but provides no explicit inclusion criteria (e.g., subscriber thresholds, search terms, activity filters, manual verification, or overlap with known channel lists). Without these details, systematic selection bias cannot be assessed, directly weakening the central claim that the corpus enables representative study of anti-vaccine content circulation.
Authors: We acknowledge this limitation in the current version. The 119 channels were selected via a multi-step process: (1) initial identification using Telegram search with terms such as 'anti-vacina Brasil', 'vacina não', 'imunização falsa' and related Portuguese keywords; (2) filtering to channels with at least 5,000 subscribers and average monthly activity of 30+ posts during 2020-2025; (3) manual verification by two domain experts confirming primary focus on anti-vaccine content; and (4) cross-referencing against lists from Brazilian public health reports and prior misinformation studies. We will add a dedicated 'Channel Selection Criteria' subsection with these details, including the exact search strings, subscriber threshold rationale, verification protocol, and any channels excluded, enabling readers to evaluate selection bias and representativeness. revision: yes
Circularity Check
Data release paper exhibits no circularity
full rationale
The manuscript is a data paper that describes the curation and release of a Telegram corpus. No derivations, predictions, fitted parameters, or equations are presented anywhere in the text. The central contribution is the external availability of the dataset itself rather than any internal model or claim that reduces to its own inputs by construction. Channel selection is described at a high level but is not part of any derivation chain, so the absence of detailed inclusion rules constitutes a methodological limitation rather than circularity.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Public Telegram channel data can be collected and shared for non-commercial research under ethical and privacy guidelines
Reference graph
Works this paper leans on
-
[1]
The Brazilian Health System at Crossroads: Progress, Crisis and Resilience
Adriano Massuda et al. “The Brazilian Health System at Crossroads: Progress, Crisis and Resilience”. In:BMJ Global Health3.4 (2018), e000829.doi:10.1136/bmjgh-2018-000829
-
[2]
Brasília, DF: Ministério da Saúde, 2003.url: https : / / bvsms
Ministério da Saúde.Programa Nacional de Imunizações: 30 Anos do PNI. Brasília, DF: Ministério da Saúde, 2003.url: https : / / bvsms . saude . gov . br / bvs / publicacoes / livro_30_anos_pni.pdf(visited on 10/19/2025)
work page 2003
-
[3]
Cristina Possas, João Baptista Risi, and Akira Homma. “Vaccine Coverage in the Tropics: Sharp Decline in Immunization and Implications for Disease X Preparedness and the UN 2030 Agenda”. In:Frontiers in Tropical Diseases5 (2024), p. 1441970.doi:10.3389/fitd. 2024.1441970
-
[4]
The Impact of COVID-19 on Routine Paediatric Vaccination Delivery in Brazil
Carolina Braga Moura et al. “The Impact of COVID-19 on Routine Paediatric Vaccination Delivery in Brazil”. In:Vaccine40.15 (2022), pp. 2292–2298.doi:10.1016/j.vaccine. 2022.02.076
-
[5]
Instituto Butantan.Como a Hesitação Vacinal Impactou a Rotina de Imunização no Brasil?2024.url: https://butantan.gov.br/noticias/como-a-hesitacao-vacinal- impactou-a-rotina-de-imunizacao-no-brasil(visited on 10/20/2025)
work page 2024
-
[6]
Adherence to COVID-19 Vaccination during the Pandemic: The Influence of Fake News
Luana Cristina Roberto Borges et al. “Adherence to COVID-19 Vaccination during the Pandemic: The Influence of Fake News”. In:Revista Brasileira de Enfermagem77.1 (2024), e20230284.doi:10.1590/0034-7167-2023-0284
-
[7]
Adriana Rodrigues da Cunha et al. “Impacto das Fake News sobre Vacinação na Mortalidade por COVID-19: Uma Análise Epidemiológica no Brasil”. In:Revista de Enfermagem da UFPI14.1 (2025), e6151.doi:10.26694/reufpi.v14i1.6151
-
[8]
Claire Wardle and Hossein Derakhshan.Information Disorder: Toward an Interdisci- plinary Framework for Research and Policymaking. Tech. rep. Council of Europe, 2017. url: https : / / edoc . coe . int / en / media / 7495 - information - disorder - toward - an - interdisciplinary - framework - for - research - and - policymaking . html(visited on 10/20/2025)
work page 2017
-
[9]
Eanes Torres Pereira, Sylvia Iasulaitis, and Bruno Cardoso Greco. “Analysis of Causal Relations between Vaccine Hesitancy for COVID-19 Vaccines and Ideological Orientations in Brazil”. In:Vaccine42.13 (2024), pp. 3263–3271.doi:10.1016/j.vaccine.2024.04.022
-
[10]
Fake News and Vaccine Hesitancy in the COVID-19 Pandemic in Brazil
Claudia Pereira Galhardi et al. “Fake News and Vaccine Hesitancy in the COVID-19 Pandemic in Brazil”. In:Ciência & Saúde Coletiva27.5 (2022), pp. 1849–1858.doi: 10.1590/1413-81232022275.24092021EN
-
[11]
Fake News Mediate the Relationship between Sociopolitical Factors and Vaccination Intent in Brazil
Priscila Muniz de Medeiros and Patrícia Muniz de Medeiros. “Fake News Mediate the Relationship between Sociopolitical Factors and Vaccination Intent in Brazil”. In:Health Promotion International37.6 (2022), daac110.doi:10.1093/heapro/daac110
- [12]
- [13]
-
[14]
Ana Carolina Pontalti Monari. “Who Consumes Fake News Consumes Fake News? The Uses and Meanings Attributed to Science and Journalism in Channels about COVID-19 on Telegram”. PhD thesis. Rio de Janeiro, Brazil: Institute of Communication, Scientific, and Technological Information in Health, Oswaldo Cruz Foundation (Fiocruz), 2024. 284 pp
work page 2024
-
[15]
2025.url:https://github.com/LonamiWebs/Telethon(visited on 03/09/2025)
LonamiWebs.Telethon: A Python 3 MTProto Library to Interact with Telegram’s API. 2025.url:https://github.com/LonamiWebs/Telethon(visited on 03/09/2025)
work page 2025
-
[16]
Shuyo Nakatani and contributors.langdetect: Language Detection Library for Python. 2020. url:https://pypi.org/project/langdetect/(visited on 10/17/2025)
work page 2020
-
[17]
org/faq_channels#q-what-does-the-eye-icon-mean(visited on 10/19/2025)
Telegram Messenger LLP.What Does the Eye Icon Mean?2025.url: https://telegram. org/faq_channels#q-what-does-the-eye-icon-mean(visited on 10/19/2025)
work page 2025
-
[18]
European Union Agency for Cybersecurity (ENISA).Pseudonymisation Techniques and Best Practices: Recommendations on Shaping Technology According to Data Protection and Privacy Provisions. Tech. rep. Publications Office of the European Union, 2019.url: https://www.enisa.europa.eu/publications/pseudonymisation- techniques- and- best-practices(visited on 10/19/2025)
work page 2019
-
[19]
2024.url: https://github.com/microsoft/presidio(visited on 10/17/2025)
Microsoft Corporation.Presidio: Data Protection and Anonymization SDK. 2024.url: https://github.com/microsoft/presidio(visited on 10/17/2025). 14
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.