Feature Selection and Reduction for Persian Text Classification

Zahra Robati; Morteza Zahedi; Najmeh Fayazi Far

Journal Article, Open Access

International Journal of Computer Applications (2015) 109(17) 1-5

DOI: 10.5120/19414-9005

Abstract

With the rapid growth of the World Wide Web and the increasing availability of electronic documents, automatic text classification has become a general and important machine learning problem in the text mining domain. In text classification, feature selection is used to reduce the size of the feature vector and to improve classifier performance. This paper improves Dominance, a feature selection criterion, and proposes Extended Dominance (E-Dominance) as a new criterion. E-Dominance compares favorably with common feature selection methods based on document frequency (DF), information gain (IG), entropy, χ², and Dominance on a collection of XML documents from Hamshahri2, which is commonly used in Persian text classification. The comparative study confirms the effectiveness of the proposed feature selection criterion derived from Dominance.
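To make two of the baseline criteria named in the abstract concrete, the following is a minimal sketch of document frequency (DF) and χ² scoring used to keep the top-k terms. It is not the authors' implementation and it does not reproduce Dominance or E-Dominance, whose definitions are not given here. The toy English tokens, the labels, the function names, and the choice to take the maximum χ² over categories are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy corpus: each document is a list of tokens with one label.
DOCS = [
    ["football", "match", "goal"],
    ["football", "team", "news"],
    ["market", "stock", "news"],
    ["market", "bank", "price"],
]
LABELS = ["sport", "sport", "economy", "economy"]


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set() so a term counts once per document
    return df


def chi_square(docs, labels, term, category):
    """Chi-square statistic of the 2x2 table (term present?) x (in category?)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_cat = label == category
        if has_term and in_cat:
            a += 1  # term present, document in category
        elif has_term:
            b += 1  # term present, document in another category
        elif in_cat:
            c += 1  # term absent, document in category
        else:
            d += 1  # term absent, document in another category
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


def select_top_k(docs, labels, k):
    """Keep the k terms with the highest chi-square over any category.

    Ties are broken alphabetically (an assumption, for determinism).
    """
    vocab = sorted(document_frequency(docs))
    cats = sorted(set(labels))
    scores = {t: max(chi_square(docs, labels, t, c) for c in cats) for t in vocab}
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    print(document_frequency(DOCS).most_common(3))
    print(select_top_k(DOCS, LABELS, 2))
```

In this toy corpus "news" appears in two documents, so DF alone would rank it as highly as "football" or "market", but it is spread evenly across both categories and receives a χ² score of zero. This is the kind of difference between criteria that a comparative study such as the one described in the abstract measures.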

Cite

Robati, Z., Zahedi, M., & Fayazi Far, N. (2015). Feature Selection and Reduction for Persian Text Classification. International Journal of Computer Applications, 109(17), 1–5. https://doi.org/10.5120/19414-9005