The Tone of Awareness: Topic, Sentiment, and Toxicity Maps During Mental Health Month on TikTok
Pith reviewed 2026-06-27 05:14 UTC · model grok-4.3
The pith
Mental health videos on TikTok carry negative sentiment while comments shift toward positive polarity, especially on suicide prevention.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A stable collection of topics appears across both years, with engagement heavily skewed toward a few of them. Video sentiment is frequently negative for emotionally charged subjects, whereas comments move toward mixed or positive polarity, most clearly on suicide prevention. Toxicity scores stay low in the median but display heavier tails in comments than in videos, with the outliers clustered around particular topics.
What carries the argument
BERTopic topic extraction applied to video text, followed by separate XLM-T sentiment scoring and Detoxify toxicity scoring on video transcripts versus comments.
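The topic-conditioned comparison can be sketched as a simple aggregation once each text already carries a topic and a sentiment label. This is a minimal illustration, not the authors' code: the records are invented, and the function name `sentiment_shares` is hypothetical. In the paper the topics would come from BERTopic and the labels from XLM-T.

```python
from collections import Counter, defaultdict

# Hypothetical records: (topic, source, sentiment_label).
# Invented values for illustration only; not data from the paper.
records = [
    ("suicide prevention", "video", "negative"),
    ("suicide prevention", "video", "negative"),
    ("suicide prevention", "video", "neutral"),
    ("suicide prevention", "comment", "positive"),
    ("suicide prevention", "comment", "positive"),
    ("suicide prevention", "comment", "negative"),
    ("self-care", "video", "positive"),
    ("self-care", "comment", "positive"),
]


def sentiment_shares(rows):
    """Return {(topic, source): {label: share}}; shares sum to 1 per key."""
    counts = defaultdict(Counter)
    for topic, source, label in rows:
        counts[(topic, source)][label] += 1
    shares = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        shares[key] = {label: n / total for label, n in counter.items()}
    return shares


shares = sentiment_shares(records)
video = shares[("suicide prevention", "video")]
comment = shares[("suicide prevention", "comment")]
print("video:", video)
print("comment:", comment)
```

Keeping the source (video or comment) in the grouping key is what allows creator framing and audience reception to be read off separately for the same topic.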
If this is right
- Engagement concentrates on a small number of topics across both years.
- Video content on emotionally charged topics tends to carry negative sentiment.
- Comment sentiment becomes more positive than video sentiment, particularly under suicide prevention topics.
- Toxicity outliers appear more often and more extremely in comments than in videos and cluster on specific topics.
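The "low median, heavier tail" pattern in the last point can be made concrete by reporting a median next to an upper percentile for each source. The sketch below uses invented scores and a nearest-rank percentile. The helper names and the choice of the 95th percentile are assumptions for illustration, not the statistics reported in the paper.

```python
import math
from statistics import median


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def tail_summary(scores, q=95):
    """Summarize a score distribution by its median and an upper percentile."""
    return {"median": median(scores), "upper": percentile(scores, q)}


# Invented toxicity scores in [0, 1]; not data from the paper.
video_scores = [0.01, 0.02, 0.01, 0.03, 0.02, 0.01, 0.02, 0.04, 0.02, 0.05]
comment_scores = [0.01, 0.02, 0.01, 0.03, 0.02, 0.01, 0.02, 0.04, 0.60, 0.90]

video_summary = tail_summary(video_scores)
comment_summary = tail_summary(comment_scores)
print(video_summary, comment_summary)
```

In this toy case both sources share the same median while the comment distribution has a far larger upper percentile, which is the shape the claim describes.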
Where Pith is reading between the lines
- Audience comments may soften the negative tone set by video creators on sensitive subjects.
- Awareness-month campaigns reach viewers unevenly because a few topics capture most interaction.
- Platform moderation could focus on the small set of topics that generate the longest toxicity tails.
- Repeating the same analysis on non-campaign periods would test whether the observed topic stability is unique to awareness months.
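One way to quantify how unevenly interaction is spread across topics is the share of total engagement captured by the top k topics. The following sketch assumes engagement has already been summed per topic. The topic names, counts, and the function `top_k_share` are hypothetical.

```python
def top_k_share(engagement_by_topic, k):
    """Fraction of total engagement held by the k most-engaged topics."""
    totals = sorted(engagement_by_topic.values(), reverse=True)
    grand_total = sum(totals)
    if grand_total == 0:
        return 0.0
    return sum(totals[:k]) / grand_total


# Invented engagement counts per topic; not data from the paper.
engagement = {
    "topic_a": 700,
    "topic_b": 200,
    "topic_c": 50,
    "topic_d": 30,
    "topic_e": 20,
}

share_top2 = top_k_share(engagement, 2)
print(f"top-2 topics hold {share_top2:.0%} of engagement")
```

Computing the same share on a non-campaign month would give a direct point of comparison for the last suggestion above.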
Load-bearing premise
The pre-trained sentiment and toxicity models measure the intended emotional and interpersonal framing of TikTok mental health text without needing domain-specific adjustment.
What would settle it
A hand-labeled sample of several hundred TikTok mental health videos and comments on which the XLM-T and Detoxify models disagree with human raters at rates above 30 percent would undermine the reported sentiment and toxicity patterns.
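The check described above reduces to a disagreement rate between model labels and human labels on the same items, compared against the 30 percent figure. A minimal sketch, with invented labels and a hypothetical function name:

```python
def disagreement_rate(model_labels, human_labels):
    """Share of items where the model label differs from the human label."""
    if len(model_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not model_labels:
        raise ValueError("label lists must not be empty")
    mismatches = sum(m != h for m, h in zip(model_labels, human_labels))
    return mismatches / len(model_labels)


# Invented labels for ten items; a real check would use several hundred.
model = ["neg", "neg", "pos", "neu", "pos", "neg", "neu", "pos", "neg", "pos"]
human = ["neg", "pos", "pos", "neu", "pos", "neg", "neg", "pos", "neg", "pos"]

THRESHOLD = 0.30  # the 30 percent figure named in the text
rate = disagreement_rate(model, human)
print(f"disagreement = {rate:.0%}, exceeds threshold: {rate > THRESHOLD}")
```

Reporting the rate separately for videos and comments, and per topic, would show whether errors concentrate in the same places as the reported patterns.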
Original abstract
Despite rising concerns about the mental health effects associated with the usage of TikTok, little is known about how related content is framed by creators and received by audiences. We collect the content of 28,341 TikTok videos and 80,130 comments from Mental Health Awareness Month (May) in 2023 and 2024 via the TikTok Research API, and study how the tone of awareness varies across topics and years. We characterize "tone" as the emotional and interpersonal framing of mental health discourse, operationalized through sentiment and toxicity measures. We extract topics from video text using BERTopic and log-odds keywords, then quantify topic-conditioned sentiment (XLM-T) and toxicity (Detoxify) separately for video transcriptions and comments. Sentiment captures the affective valence of content, while toxicity reflects the presence of harmful or abusive language. We find a stable set of recurring themes across years, spanning clinical conditions, emotional disclosure, self-care, and campaign-oriented content, with engagement highly skewed toward a small subset of topics. All sentiment and toxicity analyses are computed separately for video content and comments, allowing us to distinguish between content production and audience reception. Sentiment in videos is often negative for emotionally charged topics, while comments tend to shift toward more mixed or positive polarity, especially for suicide prevention. Toxicity is low in median overall, but exhibits longer-tailed outliers in comments than in videos, concentrated in specific topics (e.g., "Duet", "Suicide Prevention", and "Psychisch"). Overall, our results provide a topic-level decomposition of mental health discourse on TikTok during awareness-month campaigns.
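The abstract mentions log-odds keywords without giving a formula. A common formulation is the log-odds ratio with an informative Dirichlet prior (Monroe et al., 2008), sketched below. Whether the paper uses exactly this variant is an assumption, and the toy corpora, the choice of prior, and the function name are illustrative only.

```python
import math
from collections import Counter


def log_odds_z(target_counts, other_counts, prior_counts):
    """z-scored log-odds ratio per word, with a Dirichlet prior.

    Positive z means the word is over-represented in the target corpus.
    Every word in either corpus is assumed to have a positive prior count.
    """
    n_target = sum(target_counts.values())
    n_other = sum(other_counts.values())
    a0 = sum(prior_counts.values())
    scores = {}
    for word, a_w in prior_counts.items():
        y_t = target_counts.get(word, 0)
        y_o = other_counts.get(word, 0)
        logit_t = math.log((y_t + a_w) / (n_target + a0 - y_t - a_w))
        logit_o = math.log((y_o + a_w) / (n_other + a0 - y_o - a_w))
        variance = 1 / (y_t + a_w) + 1 / (y_o + a_w)
        scores[word] = (logit_t - logit_o) / math.sqrt(variance)
    return scores


# Toy corpora; the prior is the pooled counts, one common choice.
target = Counter("therapy therapy therapy help".split())
other = Counter("dance dance help music".split())
prior = target + other

z = log_odds_z(target, other, prior)
for word, score in sorted(z.items(), key=lambda kv: -kv[1]):
    print(f"{word:8s} {score:+.2f}")
```

Dividing by the estimated standard deviation damps rare words, so the highest-scoring keywords are those that are both distinctive and frequent enough to trust.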
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper collects 28,341 TikTok videos and 80,130 comments posted during Mental Health Awareness Month (May) in 2023 and 2024 via the TikTok Research API. It applies BERTopic to extract topics from video transcripts, then computes topic-conditioned sentiment via the pre-trained XLM-T model and toxicity via Detoxify, separately on video text and comments. The central claims are that recurring themes (clinical conditions, emotional disclosure, self-care, campaign content) are stable across years with highly skewed engagement; video sentiment is often negative on emotionally charged topics while comments shift toward mixed/positive polarity (especially suicide prevention); and toxicity is low in median but shows longer-tailed outliers in comments, concentrated in specific topics.
Significance. If the off-the-shelf sentiment and toxicity scores prove reliable on this corpus, the work supplies a useful large-scale, topic-resolved map of mental-health discourse that distinguishes creator framing from audience reception and documents stability across two campaign years. The scale of the TikTok Research API sample and the separation of video vs. comment analyses are clear strengths for an empirical measurement study in this domain.
major comments (1)
- [Methods description of sentiment and toxicity quantification (abstract and corresponding methods section)] The central claims about negative video sentiment for charged topics, positive comment polarity shifts (especially suicide prevention), and topic-specific toxicity tails rest entirely on the outputs of the pre-trained XLM-T and Detoxify models applied without domain adaptation, human annotation validation, or error analysis on the TikTok mental-health corpus. TikTok text is short-form, emoji-laden, slang-heavy, and often ironic or multilingual; general-domain models frequently misalign with human judgments on such data. No section of the manuscript reports fine-tuning, inter-annotator agreement, or even a small held-out validation set against which model accuracy could be assessed.
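The inter-annotator agreement the comment asks for is often reported as Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch for two annotators follows. The labels are invented, and the choice of kappa is an assumption, since the report does not name a specific statistic.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Invented annotations for four items.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

Here raw agreement is 0.75 but chance agreement is 0.50, so kappa comes out at 0.50, which is why the chance-corrected figure is usually the one reported.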
Simulated Author's Rebuttal
We thank the referee for their detailed feedback on our manuscript. The central methodological concern regarding validation of the sentiment and toxicity models is addressed point-by-point below.
Point-by-point responses
Referee: [Methods description of sentiment and toxicity quantification (abstract and corresponding methods section)] The central claims about negative video sentiment for charged topics, positive comment polarity shifts (especially suicide prevention), and topic-specific toxicity tails rest entirely on the outputs of the pre-trained XLM-T and Detoxify models applied without domain adaptation, human annotation validation, or error analysis on the TikTok mental-health corpus. TikTok text is short-form, emoji-laden, slang-heavy, and often ironic or multilingual; general-domain models frequently misalign with human judgments on such data. No section of the manuscript reports fine-tuning, inter-annotator agreement, or even a small held-out validation set against which model accuracy could be assessed.
Authors: We agree that the manuscript does not report domain adaptation, human validation, or error analysis for the XLM-T and Detoxify models on this TikTok corpus. These off-the-shelf models were applied due to their established use in social media sentiment and toxicity detection, including short-form content, without fine-tuning to maintain reproducibility and scale. We acknowledge that TikTok-specific features such as emojis, slang, and irony may affect alignment with human judgments, and that the absence of a validation set is a limitation. We will revise the manuscript to add an explicit discussion of model applicability and potential limitations in the Methods section, along with a new Limitations subsection in the Discussion that references prior evaluations of these models on similar platforms. We cannot provide new human annotations or inter-annotator agreement metrics, as these were not collected in the original study.
Revision: partial
- Empirical validation of the XLM-T and Detoxify models via human annotation or error analysis on the TikTok mental-health corpus
Circularity Check
No circularity: empirical measurement study with direct application of external models
Full rationale
The paper collects TikTok data and applies BERTopic for topics plus off-the-shelf XLM-T and Detoxify models for sentiment/toxicity. No equations, fitted parameters, derivations, or self-citations appear in the provided text or abstract. All reported patterns (stable themes, sentiment shifts, toxicity tails) are direct outputs of these external tools on the corpus, with no reduction of results to inputs by construction. This matches the default case of a self-contained empirical study.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Pre-trained BERTopic, XLM-T, and Detoxify models produce reliable topic, sentiment, and toxicity labels when applied directly to TikTok mental health transcripts and comments.
Appendix A: Hyperparameters used for topic modeling
Sentence Embeddings
- Embedding model: all-mpnet-base-v2
UMAP
- Neighbors (k): 20
- Embedding dim.: 10 (2 for visualization)
- Min. distance: 0.05
- Training epochs: 50,000
- Distance metric: cosine
HDBSCAN
- Min. cluster si...