A central problem in text classification for information filtering and information retrieval (IF/IR) is the high dimensionality of the data. To cope with this problem, we propose a technique based on rough set theory. Given corpora of documents and a training set of classified example documents, the technique locates a minimal set of co-ordinate keywords that distinguishes between classes of documents, thereby reducing the dimensionality of the keyword vectors. In addition, we generate several reduct bases for classifying new objects, in the hope that combining the answers of multiple reduct bases yields better performance. To obtain tidy and effective rules, we apply value reduction to produce the final rules. This paper describes the proposed technique and provides experimental results. © Springer-Verlag 2003.
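As a rough illustration of what "a minimal set of keywords that distinguishes between classes" means, the following sketch searches a tiny, made-up decision table for its smallest reducts by brute force. The keyword names, the toy documents, and the helper functions `is_consistent` and `minimal_reducts` are assumptions introduced here for illustration; the exhaustive search is not the authors' algorithm and would not scale to real corpora.

```python
from itertools import combinations


def is_consistent(rows, labels, attrs):
    """True if documents that agree on every keyword in attrs share one class."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        # Remember the first class seen for this keyword pattern and
        # fail as soon as a different class shows up with the same pattern.
        if seen.setdefault(key, label) != label:
            return False
    return True


def minimal_reducts(rows, labels):
    """Return all smallest keyword subsets (as index tuples) that separate the classes."""
    n_attrs = len(rows[0])
    for size in range(1, n_attrs + 1):
        found = [
            subset
            for subset in combinations(range(n_attrs), size)
            if is_consistent(rows, labels, subset)
        ]
        if found:
            return found
    return []  # the table itself is inconsistent


# Hypothetical toy data: 1 means the keyword occurs in the document.
keywords = ["goal", "match", "stock", "market"]
rows = [
    (1, 1, 0, 0),
    (1, 0, 0, 1),
    (0, 0, 1, 1),
    (0, 1, 1, 0),
]
labels = ["sports", "sports", "finance", "finance"]

reducts = minimal_reducts(rows, labels)
print([[keywords[i] for i in subset] for subset in reducts])
# prints [['goal'], ['stock']]
```

In this toy table either "goal" or "stock" alone is enough to tell the two classes apart, so the four-dimensional keyword vector shrinks to one dimension, and the existence of two distinct reducts mirrors the idea of classifying with several reduct bases and combining their answers.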
CITATION STYLE
Bao, Y., Asai, D., Du, X., Yamada, K., & Ishii, N. (2004). An effective rough set-based method for text classification. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2690, 545–552. https://doi.org/10.1007/978-3-540-45080-1_75