arxiv: 1410.6830 · v1 · pith:QX7PTP2Jnew · submitted 2014-10-24 · 💻 cs.CL · cs.LG

Clustering Words by Projection Entropy

keywords: text, entropy, words, apply, procedure, projection, agglomeration, agglomerative

We apply entropy agglomeration (EA), a recently introduced algorithm, to cluster the words of a literary text. EA is a greedy agglomerative procedure that minimizes projection entropy (PE), a function that quantifies the segmentedness of an element set. To apply it, the text is reduced to a feature allocation, a combinatorial object that represents the word occurrences in the text's paragraphs. The experimental results demonstrate that EA, despite this reduction and its simplicity, is useful in capturing significant relationships among the words in the text. The procedure was implemented in Python and published as free software: REBUS.
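
The procedure described in the abstract can be illustrated with a minimal sketch. It is not the REBUS implementation, and the abstract does not spell out the formulas, so the following are assumptions: each paragraph is treated as a block (the set of words occurring in it); the PE of a word subset S is taken to be the sum, over blocks B that intersect S, of (|B∩S|/|S|) log(|S|/|B∩S|); and EA repeatedly merges the pair of clusters whose union has the lowest PE. The names `projection_entropy` and `entropy_agglomeration` and the toy data are hypothetical.

```python
import math
from itertools import combinations


def projection_entropy(blocks, subset):
    """Assumed PE: entropy of the blocks after projecting them onto `subset`."""
    subset = frozenset(subset)
    n = len(subset)
    total = 0.0
    for block in blocks:
        k = len(subset & block)
        if k:  # blocks that miss the subset entirely contribute nothing
            total += (k / n) * math.log(n / k)
    return total


def entropy_agglomeration(blocks):
    """Greedy sketch: merge the cluster pair whose union has minimal PE."""
    elements = set().union(*blocks)
    clusters = [frozenset([e]) for e in sorted(elements)]
    merges = []  # list of (merged cluster, PE of that cluster)
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: projection_entropy(blocks, pair[0] | pair[1]),
        )
        merged = a | b
        merges.append((merged, projection_entropy(blocks, merged)))
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
    return merges


# Toy feature allocation: one block of words per (hypothetical) paragraph.
blocks = [
    frozenset({"whale", "ship"}),
    frozenset({"whale", "ship"}),
    frozenset({"captain", "deck"}),
    frozenset({"captain", "deck", "ship"}),
]

merges = entropy_agglomeration(blocks)
for cluster, pe in merges:
    print(sorted(cluster), round(pe, 4))
```

In this toy input, "captain" and "deck" always occur together, so their union has a PE of 0 and they are merged first; "whale" and "ship" follow, since only one paragraph separates them. The exhaustive pair search is quadratic in the number of clusters at every step, which is fine for a sketch but would need optimization for a full vocabulary.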
