A rough set-based approach to text classification

Alexios Chouchoulas; Qiang Shen

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 1711, pp. 118-127 (1999)

DOI: 10.1007/978-3-540-48061-7_16


Abstract

A non-trivial obstacle in good text classification for information filtering and retrieval (IF/IR) is the dimensionality of the data. This paper proposes a technique using Rough Set Theory to alleviate this situation. Given corpora of documents and a training set of examples of classified documents, the technique locates a minimal set of coordinate keywords to distinguish between classes of documents, reducing the dimensionality of the keyword vectors. This simplifies the creation of knowledge-based IF/IR systems, speeds up their operation, and allows easy editing of the rule bases employed. The paper describes the proposed technique, discusses the integration of a keyword acquisition algorithm with a rough set-based dimensionality reduction algorithm, and provides experimental results of a proof-of-concept implementation.
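The reduction step described in the abstract can be illustrated with a minimal sketch. It assumes binary keyword-presence vectors and uses a greedy, dependency-driven search, a common heuristic for rough set attribute reduction; the abstract does not specify the algorithm, so this is not necessarily the authors' method. The function names `dependency` and `greedy_reduct`, the keyword list, and the toy documents are all hypothetical. A greedy search returns a small distinguishing keyword set but does not guarantee a minimal one.

```python
from collections import defaultdict

def dependency(rows, labels, attrs):
    """Rough-set dependency degree: the fraction of documents whose class
    is fully determined by their values on the attributes in `attrs`."""
    classes_seen = defaultdict(set)   # attribute-value tuple -> classes observed
    group_size = defaultdict(int)     # attribute-value tuple -> number of documents
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        classes_seen[key].add(label)
        group_size[key] += 1
    # Documents in groups that map to exactly one class form the positive region.
    consistent = sum(group_size[k] for k, c in classes_seen.items() if len(c) == 1)
    return consistent / len(rows)

def greedy_reduct(rows, labels):
    """Greedily add the attribute that raises the dependency degree most,
    until the subset discriminates as well as the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        remaining = [a for a in all_attrs if a not in chosen]
        best = max(remaining, key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    return chosen

# Hypothetical toy corpus: one row per document, one column per keyword.
keywords = ["market", "goal", "election", "team"]
rows = [
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
]
labels = ["finance", "politics", "sport", "politics", "finance"]

reduct = greedy_reduct(rows, labels)
print([keywords[i] for i in reduct])
```

On this toy corpus the search keeps two of the four keywords, which is the kind of shortening of the keyword vectors the abstract refers to.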

Citation (APA)

Chouchoulas, A., & Shen, Q. (1999). A rough set-based approach to text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1711, pp. 118–127). Springer Verlag. https://doi.org/10.1007/978-3-540-48061-7_16
