The Tone of Awareness: Topic, Sentiment, and Toxicity Maps During Mental Health Month on TikTok

Andreia Sofia Teixeira; Anindya Mondal; Filipi Nascimento Silva; Henrique Ferraz de Arruda; Kleber Andrade Oliveira; Pranay Gundala Reddy

arXiv: 2606.13581 · v1 · pith:GMB2LMQ3new · submitted 2026-06-11 · cs.CY · cs.CL · cs.HC · physics.soc-ph

Henrique Ferraz de Arruda, Andreia Sofia Teixeira, Pranay Gundala Reddy, Anindya Mondal, Kleber Andrade Oliveira, Filipi Nascimento Silva

Pith reviewed 2026-06-27 05:14 UTC · model grok-4.3

classification: cs.CY · cs.CL · cs.HC · physics.soc-ph

keywords: TikTok, mental health, sentiment analysis, toxicity detection, topic modeling, social media, awareness campaigns

The pith

Mental health videos on TikTok carry negative sentiment while comments shift toward positive polarity, especially on suicide prevention.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper collects over 28,000 TikTok videos and 80,000 comments posted during Mental Health Awareness Month in 2023 and 2024. It identifies a stable set of recurring topics, such as clinical conditions, emotional disclosure, self-care, and campaign content, with most engagement concentrated on a small subset. Sentiment analysis shows that videos are often negative on emotionally charged topics, while comments tend to be more mixed or positive. Toxicity is low in the median, yet comments show a heavier tail of outliers than videos, concentrated on specific topics. This decomposition separates how creators frame awareness from how audiences receive it.

Core claim

A stable collection of topics appears across both years, with engagement heavily skewed toward a few of them. Video sentiment is frequently negative for emotionally charged subjects, whereas comments move toward mixed or positive polarity, most clearly on suicide prevention. Toxicity scores stay low in the median but display heavier tails in comments than in videos, with the outliers clustered around particular topics.
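To make the "low median, heavy tail" distinction concrete, here is a minimal sketch that summarizes a list of toxicity scores by its median and an upper percentile. The scores are made-up illustrative values, not data from the paper, and using the 99th percentile as the tail statistic is an assumption of this sketch.

```python
from statistics import median, quantiles


def tail_summary(scores, n=100):
    """Return (median, highest percentile cut) for a list of scores in [0, 1]."""
    # quantiles(..., n=100) returns the 99 percentile cut points;
    # the last one is the 99th percentile.
    cuts = quantiles(scores, n=n, method="inclusive")
    return median(scores), cuts[-1]


# Hypothetical toxicity scores: both groups sit mostly near zero,
# but the comment group contains a few extreme values.
video_scores = [0.01] * 95 + [0.05] * 5
comment_scores = [0.01] * 95 + [0.60, 0.70, 0.80, 0.90, 0.95]

video_median, video_p99 = tail_summary(video_scores)
comment_median, comment_p99 = tail_summary(comment_scores)
```

In this toy example the two medians coincide while the upper percentile of the comment group is far larger, which is the kind of pattern a median alone would hide.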

What carries the argument

BERTopic topic extraction applied to video text, followed by separate XLM-T sentiment scoring and Detoxify toxicity scoring on video transcripts versus comments.
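Separating creator content from audience reception amounts to grouping scores by topic and by source before summarizing them. A minimal sketch follows, assuming each item already carries a topic label and a polarity score. The record layout, the function name `polarity_shift`, and the example values are hypothetical, and mean polarity is only one possible summary.

```python
from collections import defaultdict
from statistics import mean


def polarity_shift(records):
    """Per topic, return mean comment polarity minus mean video polarity.

    Each record is (topic, source, polarity) with source in
    {"video", "comment"} and polarity in [-1, 1].
    Topics that lack either source are skipped.
    """
    grouped = defaultdict(lambda: {"video": [], "comment": []})
    for topic, source, polarity in records:
        grouped[topic][source].append(polarity)

    shifts = {}
    for topic, by_source in grouped.items():
        if by_source["video"] and by_source["comment"]:
            shifts[topic] = mean(by_source["comment"]) - mean(by_source["video"])
    return shifts


# Hypothetical polarity scores, e.g. P(positive) - P(negative)
# derived from a three-class sentiment classifier.
records = [
    ("suicide prevention", "video", -0.5),
    ("suicide prevention", "video", -0.25),
    ("suicide prevention", "comment", 0.25),
    ("suicide prevention", "comment", 0.5),
    ("self-care", "video", 0.5),
    ("self-care", "comment", 0.5),
]
shifts = polarity_shift(records)
```

A positive value means the comments under a topic are, on average, more positive than the videos that prompted them.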

If this is right

Engagement concentrates on a small number of topics across both years.
Video content on emotionally charged topics tends to carry negative sentiment.
Comment sentiment becomes more positive than video sentiment, particularly under suicide prevention topics.
Toxicity outliers appear more often and more extremely in comments than in videos and cluster on specific topics.
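The first implication, that engagement concentrates on a few topics, can be checked with a simple share statistic. The sketch below is illustrative only: the function name `top_k_share` and the counts are hypothetical, and the paper may measure skew differently.

```python
def top_k_share(counts_by_topic, k):
    """Fraction of all comments that fall under the k most-commented topics."""
    total = sum(counts_by_topic.values())
    if total == 0:
        return 0.0
    top = sorted(counts_by_topic.values(), reverse=True)[:k]
    return sum(top) / total


# Hypothetical comment counts per topic (not taken from the paper).
comment_counts = {"topic_a": 600, "topic_b": 250, "topic_c": 100, "topic_d": 50}
share = top_k_share(comment_counts, k=2)
```

Here the two largest topics hold 850 of 1,000 comments, so a value close to 1 for small `k` indicates strongly skewed engagement.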

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Audience comments may soften the negative tone set by video creators on sensitive subjects.
Awareness-month campaigns reach viewers unevenly because a few topics capture most interaction.
Platform moderation could focus on the small set of topics that generate the longest toxicity tails.
Repeating the same analysis on non-campaign periods would test whether the observed topic stability is unique to awareness months.

Load-bearing premise

The pre-trained sentiment and toxicity models measure the intended emotional and interpersonal framing of TikTok mental health text without needing domain-specific adjustment.

What would settle it

A hand-labeled sample of several hundred TikTok mental health videos and comments on which the XLM-T and Detoxify models disagree with human raters at rates above 30 percent would undermine the reported sentiment and toxicity patterns.
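The check described above reduces to comparing model labels with human labels on the same items. A minimal sketch, assuming both label lists are aligned item by item: the label values are invented for illustration, and only the 30 percent threshold comes from the text.

```python
def disagreement_rate(model_labels, human_labels):
    """Share of items on which the model label differs from the human label."""
    if len(model_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not model_labels:
        raise ValueError("label lists must not be empty")
    mismatches = sum(m != h for m, h in zip(model_labels, human_labels))
    return mismatches / len(model_labels)


THRESHOLD = 0.30  # the 30 percent figure quoted above

# Hypothetical labels for ten items (a real check would use several hundred).
model = ["neg", "neg", "pos", "neu", "pos", "neg", "neu", "pos", "neg", "pos"]
human = ["neg", "pos", "pos", "neu", "neg", "neg", "neg", "pos", "pos", "pos"]

rate = disagreement_rate(model, human)
undermined = rate > THRESHOLD
```

With so few items the rate is very noisy; a sample of several hundred, as the text suggests, would give a far more stable estimate.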

Figures

Figures reproduced from arXiv: 2606.13581 by Andreia Sofia Teixeira, Anindya Mondal, Filipi Nascimento Silva, Henrique Ferraz de Arruda, Kleber Andrade Oliveira, Pranay Gundala Reddy.

**Figure 1.** Number of mental health-related posts and comments from the two periods: (a) 2023 and (b) 2024. The interval ...

**Figure 2.** Schematic representation of our dataset collection

**Figure 3.** Zoomed-in region of the UMAP projection represent...

**Figure 4.** Fractions of posts per topic for the years 2023 and ...

**Figure 5.** Fractions of comments by topic

**Figure 6.** Toxicity and sentiment polarity results for posts and comments. Panels (a)-(c) display the distributions of Detoxify ...

Original abstract

Despite raising concerns about the mental health effects associated with the usage of TikTok, little is known about how related content is framed by creators and received by audiences. We collect the content of 28,341 TikTok videos and 80,130 comments from Mental Health Awareness Month (May) in 2023 and 2024 via the TikTok Research API, and study how the tone of awareness varies across topics and years. We characterize "tone" as the emotional and interpersonal framing of mental health discourse, operationalized through sentiment and toxicity measures. We extract topics from video text using BERTopic and log-odds keywords, then quantify topic-conditioned sentiment (XLM-T) and toxicity (Detoxify) separately for video transcriptions and comments. Sentiment captures the affective valence of content, while toxicity reflects the presence of harmful or abusive language. We find a stable set of recurring themes across years, spanning clinical conditions, emotional disclosure, self-care, and campaign-oriented content, with engagement highly skewed toward a small subset of topics. All sentiment and toxicity analyses are computed separately for video content and comments, allowing us to distinguish between content production and audience reception. Sentiment in videos is often negative for emotionally charged topics, while comments tend to shift toward more mixed or positive polarity, especially for suicide prevention. Toxicity is low in median overall, but exhibits longer-tailed outliers in comments than in videos that are more pronounced in comments and concentrated in specific topics (e.g., "Duet", "Suicide Prevention", and "Psychisch"). Overall, our results provide a topic-level decomposition of mental health discourse on TikTok during awareness-month campaigns.
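The abstract mentions "log-odds keywords" without spelling out the computation. One common formulation is the z-scored log-odds ratio with a Dirichlet prior (Monroe et al., "Fightin' Words"). The sketch below assumes that variant and uses the pooled word counts as the prior; both choices, the function name `log_odds_z`, and the toy tokens are assumptions rather than details confirmed by the source.

```python
import math
from collections import Counter


def log_odds_z(topic_tokens, rest_tokens, prior_scale=1.0):
    """z-scored log-odds of each word in a topic versus the rest of the corpus.

    The Dirichlet prior for each word is its pooled count times prior_scale
    (an assumption of this sketch).
    """
    y_i, y_j = Counter(topic_tokens), Counter(rest_tokens)
    n_i, n_j = sum(y_i.values()), sum(y_j.values())
    prior = {w: prior_scale * (y_i[w] + y_j[w]) for w in set(y_i) | set(y_j)}
    a_0 = sum(prior.values())

    scores = {}
    for w, a_w in prior.items():
        odds_i = (y_i[w] + a_w) / (n_i + a_0 - y_i[w] - a_w)
        odds_j = (y_j[w] + a_w) / (n_j + a_0 - y_j[w] - a_w)
        delta = math.log(odds_i) - math.log(odds_j)
        variance = 1.0 / (y_i[w] + a_w) + 1.0 / (y_j[w] + a_w)
        scores[w] = delta / math.sqrt(variance)
    return scores


# Toy token lists standing in for one topic and for all other topics.
topic_tokens = "anxiety panic anxiety breathe help".split()
rest_tokens = "selfcare routine help sleep routine".split()

scores = log_odds_z(topic_tokens, rest_tokens)
top_keyword = max(scores, key=scores.get)
```

Words with large positive scores are over-represented in the topic, words with large negative scores are over-represented elsewhere, and words used equally in both score zero.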

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Descriptive mapping of TikTok mental health topics and video-vs-comment tone using off-the-shelf NLP tools, with the main weakness being lack of model validation on this domain.

This paper pulls 28k TikTok videos and 80k comments from Mental Health Awareness Month in 2023 and 2024, runs BERTopic on the video text to surface recurring themes, and then scores sentiment and toxicity separately on videos and comments.

What is actually new is the dataset itself plus the topic-conditioned breakdown that separates creator output from audience reception. The findings on stable themes across years, skewed engagement, more negative video sentiment on charged topics, and comment polarity shifts (especially around suicide prevention) are straightforward empirical observations.

The work is competent at applying established pipelines to a fresh platform-specific corpus and at highlighting the video-comment contrast, which is a useful angle for platform discourse studies.

The soft spot is exactly the one flagged in the stress test. XLM-T and Detoxify are applied without any reported fine-tuning, human validation, or error analysis on TikTok mental health text. That text is short, emoji-heavy, slangy, and often ironic or multilingual. General models frequently misalign with human judgment in this setting, so the topic-level sentiment and toxicity maps rest on an untested assumption. If those scores are off, the claimed patterns lose their basis.

This is for people working on social media and mental health content analysis. It gives a granular descriptive view that could be cited for the dataset or the video-comment distinction, but the tone results need to be treated as preliminary. It deserves a serious referee because the data collection is timely and the core question is reasonable, even though the methods section will need work on validation.

Referee Report

1 major / 0 minor

Summary. The paper collects 28,341 TikTok videos and 80,130 comments posted during Mental Health Awareness Month (May) in 2023 and 2024 via the TikTok Research API. It applies BERTopic to extract topics from video transcripts, then computes topic-conditioned sentiment via the pre-trained XLM-T model and toxicity via Detoxify, separately on video text and comments. The central claims are that recurring themes (clinical conditions, emotional disclosure, self-care, campaign content) are stable across years with highly skewed engagement; video sentiment is often negative on emotionally charged topics while comments shift toward mixed/positive polarity (especially suicide prevention); and toxicity is low in median but shows longer-tailed outliers in comments, concentrated in specific topics.

Significance. If the off-the-shelf sentiment and toxicity scores prove reliable on this corpus, the work supplies a useful large-scale, topic-resolved map of mental-health discourse that distinguishes creator framing from audience reception and documents stability across two campaign years. The scale of the TikTok Research API sample and the separation of video vs. comment analyses are clear strengths for an empirical measurement study in this domain.

major comments (1)

[Methods description of sentiment and toxicity quantification (abstract and corresponding methods section)] The central claims about negative video sentiment for charged topics, positive comment polarity shifts (especially suicide prevention), and topic-specific toxicity tails rest entirely on the outputs of the pre-trained XLM-T and Detoxify models applied without domain adaptation, human annotation validation, or error analysis on the TikTok mental-health corpus. TikTok text is short-form, emoji-laden, slang-heavy, and often ironic or multilingual; general-domain models frequently misalign with human judgments on such data. No section of the manuscript reports fine-tuning, inter-annotator agreement, or even a small held-out validation set against which model accuracy could be assessed.
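The inter-annotator agreement the referee asks for is often reported as Cohen's kappa, which corrects raw agreement for agreement expected by chance. The following is a minimal sketch with invented labels; the manuscript reports no annotations, so nothing here reflects actual data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two equally long label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical toxicity labels from two annotators on eight comments.
annotator_1 = ["toxic", "ok", "ok", "ok", "toxic", "ok", "ok", "ok"]
annotator_2 = ["toxic", "ok", "ok", "toxic", "ok", "ok", "ok", "ok"]

kappa = cohens_kappa(annotator_1, annotator_2)
```

In this example raw agreement is 75 percent, but because "ok" dominates both label sets the chance-corrected kappa is only about 0.33, which illustrates why raw agreement alone can look better than it is on imbalanced toxicity data.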

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for their detailed feedback on our manuscript. The central methodological concern regarding validation of the sentiment and toxicity models is addressed point-by-point below.

Referee: [Methods description of sentiment and toxicity quantification (abstract and corresponding methods section)] The central claims about negative video sentiment for charged topics, positive comment polarity shifts (especially suicide prevention), and topic-specific toxicity tails rest entirely on the outputs of the pre-trained XLM-T and Detoxify models applied without domain adaptation, human annotation validation, or error analysis on the TikTok mental-health corpus. TikTok text is short-form, emoji-laden, slang-heavy, and often ironic or multilingual; general-domain models frequently misalign with human judgments on such data. No section of the manuscript reports fine-tuning, inter-annotator agreement, or even a small held-out validation set against which model accuracy could be assessed.

Authors: We agree that the manuscript does not report domain adaptation, human validation, or error analysis for the XLM-T and Detoxify models on this TikTok corpus. We applied these off-the-shelf models without fine-tuning, both because of their established use in social media sentiment and toxicity detection (including short-form content) and to maintain reproducibility and scale. We acknowledge that TikTok-specific features such as emojis, slang, and irony may affect alignment with human judgments, and that the absence of a validation set is a limitation. We will revise the manuscript to add an explicit discussion of model applicability and potential limitations in the Methods section, along with a new Limitations subsection in the Discussion that references prior evaluations of these models on similar platforms. We cannot provide new human annotations or inter-annotator agreement metrics, as these were not collected in the original study.

Revision: partial

standing simulated objections not resolved

Empirical validation of the XLM-T and Detoxify models via human annotation or error analysis on the TikTok mental-health corpus

Circularity Check

0 steps flagged

No circularity: empirical measurement study with direct application of external models

The paper collects TikTok data and applies BERTopic for topics plus off-the-shelf XLM-T and Detoxify models for sentiment/toxicity. No equations, fitted parameters, derivations, or self-citations appear in the provided text or abstract. All reported patterns (stable themes, sentiment shifts, toxicity tails) are direct outputs of these external tools on the corpus, with no reduction of results to inputs by construction. This matches the default case of a self-contained empirical study.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Relies on the validity of off-the-shelf NLP models for TikTok text and the assumption that extracted topics reflect meaningful discourse categories.

axioms (1)

Domain assumption: Pre-trained BERTopic, XLM-T, and Detoxify models produce reliable topic, sentiment, and toxicity labels when applied directly to TikTok mental health transcripts and comments.
Paper applies these models without reported fine-tuning or domain validation.

Appendix A: Hyperparameters used for topic modeling

Sentence Embeddings
Embedding model: all-mpnet-base-v2

UMAP
Neighbors (k): 20
Embedding dim.: 10 (2 for visualization)
Min. distance: 0.05
Training epochs: 50,000
Distance metric: cosine

HDBSCAN
Min. cluster si...
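The UMAP settings listed in Appendix A can be collected into a small configuration object. The sketch below is a hedged illustration: mapping "Neighbors (k)" to `n_neighbors`, "Embedding dim." to `n_components`, and "Training epochs" to `n_epochs` assumes the umap-learn `UMAP` constructor, which the source does not name explicitly. The HDBSCAN settings are left out because the source truncates them.

```python
# UMAP settings as listed in Appendix A. The key names follow the
# umap-learn UMAP constructor arguments, which is an assumption.
UMAP_CLUSTERING = {
    "n_neighbors": 20,
    "n_components": 10,
    "min_dist": 0.05,
    "n_epochs": 50_000,
    "metric": "cosine",
}

# Same settings, reduced to two dimensions for the visualization.
UMAP_VISUALIZATION = {**UMAP_CLUSTERING, "n_components": 2}

# Sentence embedding model named in Appendix A.
EMBEDDING_MODEL = "all-mpnet-base-v2"

# Assumed usage (not executed here, requires the umap-learn package):
#   from umap import UMAP
#   reducer = UMAP(**UMAP_CLUSTERING)
```

Keeping the clustering and visualization settings in separate dictionaries makes it explicit that only the output dimensionality differs between the two projections.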

[1] [1]

Over a billion people living with mental health conditions – services require urgent scale-up

WHO. Over a billion people living with mental health conditions – services require urgent scale-up. https://www.who.int/news/item/02-09-2025-over-a- billion-people-living-with-mental-health-conditions- services-require-urgent-scale-up, 2025. [Accessed 14-01-2026]

2025

[2] [2]

Exploring problematic tiktok use and 9 Available at https://www.tiktok.com/legal/page/global/terms- of-service-research-api/en

Lakshit Jain, Luis Velez, Surya Karlapati, Mary Forand, Rajasekhar Kannali, Rao Ahmed Yousaf, Rizwan Ahmed, Zouina Sarfraz, Pearl A Sutter, Christian An- thony Tallo, et al. Exploring problematic tiktok use and 9 Available at https://www.tiktok.com/legal/page/global/terms- of-service-research-api/en. mental health issues: A systematic review of empirical ...

2025

[3] [3]

Association between problematic tiktok use and mental health: A systematic review and meta-analysis.AIMS Public Health, 12(2):491–519, 2025

Petros Galanis, Aglaia Katsiroumpa, Zoe Katsiroumpa, Polyxeni Mangoulia, Parisis Gallos, Ioannis Moisoglou, and Evmorfia Koukia. Association between problematic tiktok use and mental health: A systematic review and meta-analysis.AIMS Public Health, 12(2):491–519, 2025

2025

[4] [4]

Potential effects of the social me- dia age ban in australia for children younger than 16 years.The Lancet Digital Health, 7(4):e235–e236, Apr

Jasmine Fardouly. Potential effects of the social me- dia age ban in australia for children younger than 16 years.The Lancet Digital Health, 7(4):e235–e236, Apr

[5] [5]

doi:10.1016/j.landig.2025.01.016

ISSN 2589-7500. doi:10.1016/j.landig.2025.01.016. URL https://doi.org/10.1016/j.landig.2025.01.016. 11

work page doi:10.1016/j.landig.2025.01.016 2025

[6] [6]

Denmark’s government aims to ban access to social media for children un- der 15 (associated press news), 2025

Jamey Keaten. Denmark’s government aims to ban access to social media for children un- der 15 (associated press news), 2025. URL https://apnews.com/article/denmark-social-media- ban-children-7862d2a8cc590b4969c8931a01adc7f4

2025

[7] [7]

What we can learn from tiktok through its research api

Francesco Corso, Francesco Pierri, and Gianmarco De Francisci Morales. What we can learn from tiktok through its research api. InCompanion Publication of the 16th ACM Web Science Conference, pages 110–114, 2024

2024

[8] [8]

Max Falkenberg, Fabiana Zollo, Walter Quattrociocchi, Jürgen Pfeffer, and Andrea Baronchelli. Patterns of partisan toxicity and engagement reveal the common structure of online political communication across countries. Nature Communications, 15(1):9560, Nov 2024. ISSN 2041-1723. doi:10.1038/s41467-024-53868-0. URL https://doi.org/10.1038/s41467-024-53868-0

Maarten Grootendorst. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv preprint arXiv:2203.05794, 2022.

Francesco Barbieri, Luis Espinosa Anke, and Jose Camacho-Collados. XLM-T: Multilingual language models in Twitter for sentiment analysis and beyond. In Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, and Stelio... 2022

Laura Hanu and Unitary team. Detoxify. GitHub. https://github.com/unitaryai/detoxify, 2020.

Harri Jalonen. Social media–an arena for venting negative emotions. Online Journal of Communication and Media Technologies, 4(October 2014-Special Issue):53–70, 2014. doi:10.30935/ojcmt/5704

Jaimie Arona Krems, Laureon A. Merrie, Nina N. Rodriguez, and Keelah E.G. Williams. Venting makes people prefer—and preferentially support—us over those we vent about. Evolution and Human Behavior, 45(5):106608, 2024. ISSN 1090-5138. doi:10.1016/j.evolhumbehav.2024.106608. URL https://www.sciencedirect.com/science/article/pii/S1090513824000849

Anne Vermeulen, Heidi Vandebosch, and Wannes Heirman. #smiling, #venting, or both? adolescents’ social sharing of emotions on social media. Computers in Human Behavior, 84:211–219, 2018.

Corey H Basch, Lorie Donelle, Joseph Fera, and Christie Jaime. Deconstructing TikTok videos on mental health: cross-sectional, descriptive content analysis. JMIR formative research, 6(5):e38340, 2022.

Carlos Entrena-Serrano, Martin Degeling, Salvatore Romano, and Raziye Buse Çetin. TikTok’s research API: Problems without explanations. arXiv preprint arXiv:2506.09746, 2025.

Eddie L Ungless, Nina Markl, and Björn Ross. Experiences of censorship on TikTok across marginalised identities. In Proceedings of the International AAAI Conference on Web and Social Media, volume 19, pages 1952–1965, 2025.

Jordi Guillem Condom Tibau, Angelina Voggenreiter, Elena Pavan, and Jürgen Pfeffer. Prevalence, substance and responses to hate speech against LGBTQ communities on TikTok. Proceedings of the International AAAI Conference on Web and Social Media, 19(1):430–442, Jun. 2025. doi:10.1609/icwsm.v19i1.35824. URL https://ojs.aaai.org/index.php/ICWSM/article/view/35824

Michele Avalle, Niccolò Di Marco, Gabriele Etta, Emanuele Sangiorgio, Shayan Alipour, Anita Bonetti, Lorenzo Alvisi, Antonio Scala, Andrea Baronchelli, Matteo Cinelli, and Walter Quattrociocchi. Persistent interaction patterns across social media platforms and over time. Nature, 628(8008):582–589, Apr 2024. ISSN 1476-. doi:10.1038/s41586-024-07229-y. URL https://doi.org/10.1038/s41586-024-07229-y

Giulia Conte, Giorgia Di Iorio, Dario Esposito, Sara Romano, Fabiola Panvino, Susanna Maggi, Benedetta Altomonte, Maria Pia Casini, Mauro Ferrara, and Arianna Terrinoni. Scrolling through adolescence: a systematic review of the impact of TikTok on adolescent mental health. European Child & Adolescent Psychiatry, 34(5):1511–1527, 2025.

Darragh McCashin and Colette M Murphy. Using TikTok for public and youth mental health–a systematic review and content analysis. Clinical child psychology and psychiatry, 28(1):279–306, 2023.

Mingyue Zha and Ho-Chun Herbert Chang. Interpreting multimodal communication at scale in short-form video: Visual, audio, and textual mental health discourse on TikTok. arXiv preprint arXiv:2601.15278, 2026.

Ashlee Milton, Leah Ajmani, Michael Ann DeVito, and Stevie Chancellor. “i see me here”: mental health content, community, and algorithmic curation on TikTok. In Proceedings of the 2023 CHI conference on human factors in computing systems, pages 1–17, 2023.

Leland McInnes, John Healy, and James Melville. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426, 2018.

Leland McInnes, John Healy, and Steve Astels. hdbscan: Hierarchical density based clustering. The Journal of Open Source Software, 2(11):205, 2017.

Ricardo JGB Campello, Davoud Moulavi, and Jörg Sander. Density-based clustering based on hierarchical density estimates. In Pacific-Asia conference on knowledge discovery and data mining, pages 160–172. Springer, 2013.

Burt L Monroe, Michael P Colaresi, and Kevin M Quinn. Fightin’ words: Lexical feature selection and evaluation for identifying the content of political conflict. Political Analysis, 16(4):372–403, 2008.

Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. Unsupervised cross-lingual representation learning at scale. In Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault, editors, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics... doi:10.18653/v1/2020.acl-main.747, 2020

Juhan Lee, Rachel R. Ouellette, Dhiraj Murthy, Ben Pretzer, Tanvi Anand, and Grace Kong. Identifying e-cigarette content on TikTok: Using a BERTopic modeling approach. Nicotine & Tobacco Research, 27(1):91–96. doi:10.1093/ntr/ntae171. URL https://doi.org/10.1093/ntr/ntae171

Roman Egger and Joanne Yu. A topic modeling comparison between LDA, NMF, Top2Vec, and BERTopic to demystify Twitter posts. Frontiers in Sociology, 7:886498, 2022. doi:10.3389/fsoc.2022.886498. URL https://doi.org/10.3389/fsoc.2022.886498

Dhiraj Murthy, Simran Keshari, Srishty Arora, Q. Yang, A. Loukas, S. J. Schwartz, M. B. Harrell, E. T. Hébert, and A. V. Wilkinson. Categorizing e-cigarette-related tweets using BERT topic modeling. Emerging Trends in Drugs, Addictions, and Health, 4:100160, 2024. doi:10.1016/j.etdah.2024.100160. URL https://doi.org/10.1016/j.etdah.2024.100160

Cai Yang, Sepehr Mousavi, Abhisek Dash, Krishna P. Gummadi, and Ingmar Weber. Studying behavioral addiction by combining surveys and digital traces: A case study of TikTok. Proceedings of the International AAAI Conference on Web and Social Media, 19(1):2106–2123, 2025. doi:10.1609/icwsm.v19i1.35922. URL https://doi.org/10.1609/icwsm.v19i1.35922

Linda Xue, Francesco Corso, Nicolo Fontana, Geng Liu, Stefano Ceri, and Francesco Pierri. Towards an automated framework to audit youth safety on TikTok. In Proceedings of the Fourth Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP), pages 113–119, Suzhou, China, 2025. Association for Computational Linguistics. doi:10.18653/v1/2025.hcinlp-1.9. URL https://aclanthology.org/2025.hcinlp-1.9/

Myles Hollander, Douglas A Wolfe, and Eric Chicken. Nonparametric statistical methods. John Wiley & Sons, 2013.

Yoav Benjamini and Yosef Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1):289–300, 1995.

Allison Degrushe. “am i a sunrise or a sunset?” — understanding the meaning of TikTok’s latest trend. URL https://www.distractify.com/p/sunrise-or-sunset-meaning-tiktok

Angela Y Lee, Eric Neumann, Jamil Zaki, and Jeffrey Hancock. Americans overestimate how many social media users post harmful content. PNAS nexus, 4(12):pgaf310, 2025.

Benjamin Steel, Miriam Schirmer, Derek Ruths, and Juergen Pfeffer. Just another hour on TikTok: ID sampling to obtain a complete slice of TikTok, 2026. URL https://journalqd.org/article/view/9514

Mary Jane Friedrich. Depression is the leading cause of disability around the world. JAMA, 317(15):1517–1517, 2017.

WHO. Suicide. https://www.who.int/news-room/fact-sheets/detail/suicide, 2025. [Accessed 14-01-2026].

Appendix A: Hyperparameters used for topic modeling

Sentence Embeddings
  Embedding model: all-mpnet-base-v2

UMAP
  Neighbors (k): 20
  Embedding dim.: 10 (2 for visualization)
  Min. distance: 0.05
  Training epochs: 50,000
  Distance metric: cosine

HDBSCAN
  Min. cluster si...
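As a rough illustration of how the UMAP settings listed in Appendix A could be passed to the umap-learn package, the sketch below collects them as keyword arguments. The mapping of the appendix labels onto umap-learn parameter names (for example "Training epochs" onto n_epochs) is an assumption made here, and the helper names umap_kwargs and build_reducer are hypothetical. The HDBSCAN settings are left out because the source text is cut off at that point.

```python
"""Sketch: Appendix A UMAP settings expressed as umap-learn keyword arguments."""
from typing import Any, Dict

# Sentence-embedding model named in Appendix A.
EMBEDDING_MODEL = "all-mpnet-base-v2"


def umap_kwargs(for_visualization: bool = False) -> Dict[str, Any]:
    """Return the Appendix A UMAP values keyed by umap-learn parameter names.

    The label-to-parameter mapping is an assumption of this sketch.
    """
    return {
        "n_neighbors": 20,                               # Neighbors (k)
        "n_components": 2 if for_visualization else 10,  # Embedding dim.
        "min_dist": 0.05,                                # Min. distance
        "n_epochs": 50_000,                              # Training epochs
        "metric": "cosine",                              # Distance metric
    }


def build_reducer(for_visualization: bool = False):
    """Create a umap.UMAP instance; imported lazily so the settings can be
    inspected even when umap-learn is not installed."""
    import umap  # provided by the umap-learn package

    return umap.UMAP(**umap_kwargs(for_visualization))


if __name__ == "__main__":
    print(umap_kwargs())
    print(umap_kwargs(for_visualization=True))
```

Calling build_reducer() would give the 10-dimensional reducer, and build_reducer(for_visualization=True) the 2-dimensional one used for plotting; both require umap-learn to be installed.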