Feature selection algorithms can reduce the high dimensionality of textual cases and increase case-based task performance. However, conventional algorithms (e.g., information gain) are computationally expensive. We previously showed that, on one dataset, a rough set feature selection algorithm can reduce computational complexity without sacrificing task performance. Here we test the generality of our findings on additional feature selection algorithms, add a second dataset, and improve our empirical methodology. We observed that features of textual cases vary in their contribution to task performance based on their part-of-speech, and adapted the algorithms to include a part-of-speech bias as background knowledge. Our evaluation shows that injecting this bias significantly increases task performance for rough set algorithms, and that one of these attained significantly higher classification accuracies than information gain. We also confirmed that, under some conditions, randomized training partitions can dramatically reduce training times for rough set algorithms without compromising task performance. © Springer-Verlag Berlin Heidelberg 2006.
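The abstract names information gain as the conventional baseline. As context, the following is a minimal sketch of information-gain feature selection over bag-of-words documents; it is an illustration of the standard technique, not the authors' rough set algorithm, and all names (`entropy`, `information_gain`, `select_features`) and the toy corpus are assumptions for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(term, docs, labels):
    """IG(term) = H(C) - [P(t) H(C|t present) + P(not t) H(C|t absent)]."""
    present = [lab for doc, lab in zip(docs, labels) if term in doc]
    absent = [lab for doc, lab in zip(docs, labels) if term not in doc]
    n = len(labels)
    cond = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
    return entropy(labels) - cond

def select_features(docs, labels, k):
    """Rank the vocabulary by information gain and keep the top k terms."""
    vocab = set().union(*docs)
    return sorted(vocab, key=lambda t: information_gain(t, docs, labels),
                  reverse=True)[:k]

# Toy corpus: each document is a set of tokens, with one class label per document.
docs = [{"rough", "set"}, {"rough", "reduct"}, {"neural", "net"}, {"deep", "net"}]
labels = ["rs", "rs", "nn", "nn"]
top = select_features(docs, labels, 2)  # "rough" and "net" each separate the classes perfectly
```

Scoring every vocabulary term against every document is what makes this approach expensive on high-dimensional text, which is the cost the paper's rough set algorithms aim to avoid.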
CITATION STYLE
Gupta, K. M., Aha, D. W., & Moore, P. (2006). Rough set feature selection algorithms for textual case-based classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4106 LNAI, pp. 166–181). Springer Verlag. https://doi.org/10.1007/11805816_14