arxiv: 2410.21554 · v4 · pith:5AETDUJ7new · submitted 2024-10-28 · 💻 cs.SI

How the cascade inference problem distorts information diffusion

classification 💻 cs.SI
keywords: information, cascade, diffusion, data, reconstruction, analyses, analyze, cascades

To analyze the flow of information online, experts often rely on platform-provided data from social media companies, which typically attribute all resharing actions to an original poster. This obscures the true dynamics of how information spreads online, as users can be exposed to content in various ways. While most researchers analyze data as it is provided by the platform and overlook this issue, some attempt to infer the structure of information cascades. However, the absence of ground truth about actual diffusion cascades makes it impossible to verify the efficacy of these efforts. We propose a novel parametric reconstruction approach and use it to investigate how overlooking cascade reconstruction distorts analyses of social influence, community detection, and information diffusion. Two case studies involving data from Twitter and Bluesky reveal that cascade inference significantly impacts the identification of both influential users and communities, therefore affecting downstream analyses in general. Analysis of the diffusion of over 40,000 true and false news stories on Twitter reveals that the assumptions made during the reconstruction procedure drastically distort both microscopic and macroscopic properties of cascade networks. This work highlights the challenges of studying information spreading processes on complex networks and has significant implications for the broader study of digital platforms.
