Arabic Language Text Classification Using Dependency Syntax-Based Feature Selection

Philippe Lenca; Yannis Haralambous; Yassir Elidrissi

arxiv: 1410.4863 · v1 · pith:GSSH2MTFnew · submitted 2014-10-17 · 💻 cs.CL

keywords: feature, text, arabic, better, classification, dependency, selection, association

We study the performance of Arabic text classification combining various techniques: (a) tfidf vs. dependency syntax, for feature selection and weighting; (b) class association rules vs. support vector machines, for classification. The Arabic text is used in two forms: rootified and lightly stemmed. The results we obtain show that lightly stemmed text leads to better performance than rootified text; that class association rules are better suited for small feature sets obtained by dependency syntax constraints; and, finally, that support vector machines are better suited for large feature sets based on morphological feature selection criteria.
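To make the tfidf side of comparison (a) concrete, here is a minimal sketch of tf-idf weighting followed by a top-k feature cut, written in plain Python. It is not the authors' implementation: the abstract does not say which tf-idf variant or selection threshold the paper uses, so the formula (relative term frequency times log inverse document frequency), the function names `tfidf` and `select_top_k`, and the toy tokens are all assumptions made for illustration. Input documents are assumed to be already tokenized and stemmed.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


def select_top_k(doc_weights, k):
    """Keep the k terms whose weights, summed over all documents, are highest."""
    totals = Counter()
    for w in doc_weights:
        totals.update(w)  # adds the float weights term by term
    return [term for term, _ in totals.most_common(k)]


# Hypothetical toy corpus of already-stemmed tokens (placeholders, not real data).
docs = [["ktb", "drs"], ["ktb", "lEb"]]
weights = tfidf(docs)
features = select_top_k(weights, 2)
```

A term that occurs in every document (here `ktb`) gets weight zero and drops out of the selected set, which is the behaviour a frequency-based selector relies on. The resulting feature vectors could then be passed to either classifier family named in the abstract.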
