arxiv: 2505.15370 · v5 · pith:YUSLKH7Dnew · submitted 2025-05-21 · 💻 cs.SI

The Significance of User Characteristics for Reposting Prediction on X: A Comparative Analysis Under Distribution Shift

classification 💻 cs.SI
keywords reposting, user, features, topics, behaviour, prediction, across, characteristics

Understanding information diffusion on X (formerly Twitter) requires accurate modelling of reposting behaviour. Most existing work predicts reposting under in-distribution settings, where training and test data cover the same topics. This paper addresses a more realistic and challenging scenario: out-of-distribution prediction, i.e., forecasting reposting behaviour for new, previously unseen topics. We formulate the task at the individual level - predicting whether a specific user will repost a given post - and systematically compare the predictive power of post-related features, user-related features, and their combination across four representative models: Decision Tree, Multi-Layer Perceptron, BERT, and Qwen. Our experiments show that while post-related features perform well in-distribution, their performance declines drastically for unseen topics, with F1 scores falling to approximately 0.12. In contrast, user-related features - including user profiles, social relations, and historical behaviour - deliver strong and transferable performance, raising the F1 score to over 0.70. These results demonstrate that reposting decisions are largely content-agnostic: they are driven more by stable user characteristics than by the specific content of a post. Our findings highlight the value of user modelling for building robust prediction systems and provide new insights into the mechanisms that enable information to spread across different topics.
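The out-of-distribution setting described in the abstract can be illustrated with a small sketch: hold out whole topics so that no test topic appears in training, then score predictions with F1. This is only an assumed illustration, not the authors' pipeline. The record fields (`topic`, `user_id`, `label`), the helper names `split_by_topic` and `f1_score`, and the toy records are all hypothetical.

```python
def split_by_topic(records, held_out_topics):
    """Split records so that held-out topics appear only in the test set.

    `records` is a list of dicts with a "topic" key (hypothetical schema).
    """
    held_out = set(held_out_topics)
    train = [r for r in records if r["topic"] not in held_out]
    test = [r for r in records if r["topic"] in held_out]
    return train, test


def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (label 1 = user reposted)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy (user, post) pairs, invented purely for illustration.
records = [
    {"topic": "sports", "user_id": "u1", "label": 1},
    {"topic": "sports", "user_id": "u2", "label": 0},
    {"topic": "politics", "user_id": "u1", "label": 1},
    {"topic": "health", "user_id": "u2", "label": 0},
]

# "health" is unseen during training, mimicking the unseen-topic scenario.
train, test = split_by_topic(records, held_out_topics=["health"])
```

Under such a split, any model that leans on topic-specific post content has no training examples from the test topics, whereas features attached to a user (profile, social relations, history) remain available for both sets.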
