
arxiv: 0712.2491 · v1 · submitted 2007-12-15 · ⚛️ physics.soc-ph

Co-occurrence Network of Reuters News

classification ⚛️ physics.soc-ph
keywords network · algorithms · co-occurrence · distribution · find · importance · including · news

Networks describe various complex natural systems, including social systems. We investigate the social network of co-occurrence in the Reuters-21578 corpus, which consists of news articles that appeared on the Reuters newswire in 1987. People are represented as vertices, and two persons are connected if they co-occur in the same article. The network has small-world features with a power-law degree distribution. The network is disconnected, and the component size distribution has power-law characteristics. Community detection on a degree-reduced network yields meaningful communities. An edge-reduced network, which contains only the strong ties, has a star topology. The "importance" of persons is investigated. The network reflects the situation in 1987; 20 years later, a better judgment of the importance of these people can be made. A number of ranking algorithms, including citation count and PageRank, are used to assign ranks to vertices. The ranks given by the algorithms are compared against how well a person is represented in Wikipedia. We find Spearman's rank correlations of up to medium strength. A noteworthy finding is that PageRank consistently performed worse than the other algorithms. We analyze this further and identify reasons for it.
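The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The toy articles, the names, the `wiki_score` values, and all function names are hypothetical. Vertex degree is used as a simple stand-in ranking score, which is an assumption, since the abstract does not define its ranking algorithms in detail.

```python
from itertools import combinations


def build_cooccurrence(articles):
    """Return {person: {neighbor: number of shared articles}}."""
    graph = {}
    for people in articles:
        unique = sorted(set(people))
        for p in unique:
            graph.setdefault(p, {})  # keep persons who never co-occur
        for a, b in combinations(unique, 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return graph


def components(graph):
    """Connected components via iterative depth-first search."""
    seen, comps = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def strong_ties(graph, min_weight):
    """Edges whose co-occurrence count is at least min_weight."""
    return {(a, b): w
            for a, nbrs in graph.items()
            for b, w in nbrs.items()
            if a < b and w >= min_weight}


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Toy data (hypothetical): each inner list holds the persons named in one article.
articles = [["A", "B", "C"], ["A", "B"], ["D", "E"], ["F"]]
graph = build_cooccurrence(articles)

# Made-up external reference scores standing in for Wikipedia representation.
wiki_score = {"A": 50, "B": 40, "C": 5, "D": 20, "E": 1, "F": 3}

people = sorted(graph)
degree = [len(graph[p]) for p in people]
rho = spearman(degree, [wiki_score[p] for p in people])
print(len(components(graph)), strong_ties(graph, 2), round(rho, 3))
```

Ranks are averaged over ties because a degree-based score produces many tied vertices in a network with a heavy-tailed degree distribution. Any other ranking score can be passed to `spearman` in place of `degree`.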
