Feature subset selection in text-learning

Dunja Mladenić

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (1998) 1398 95-100

DOI: 10.1007/bfb0026677

79 Citations

45 Readers

Abstract

This paper describes several known and some new methods for feature subset selection on large text data. An experimental comparison on real-world data collected from Web users shows that the characteristics of the problem domain and of the machine learning algorithm should be considered when a feature scoring measure is selected. Our problem domain consists of hyperlinks given in the form of small documents represented as word vectors. In our learning experiments, a naive Bayesian classifier was used on the text data. The best performance was achieved by the feature selection methods based on the feature scoring measure called Odds ratio, which is known from information retrieval.
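The abstract names Odds ratio as the best-performing scoring measure but does not give its formula. The sketch below uses the definition commonly found in the text-classification literature, log[P(w|pos)(1 - P(w|neg)) / ((1 - P(w|pos)) P(w|neg))], estimated from document frequencies. The function names, the boolean labels, and the additive smoothing constant are assumptions made for illustration; this is not necessarily the exact procedure used in the paper.

```python
import math
from collections import Counter


def odds_ratio_scores(docs, labels, smoothing=0.5):
    """Score each word by the log odds ratio between two classes.

    docs:   list of token lists (one list per small document)
    labels: list of booleans, True for the positive class
    smoothing: additive constant (an assumption) that keeps every
               probability estimate strictly between 0 and 1
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos

    # Document frequencies per class: each word counts once per document.
    pos_df, neg_df = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (pos_df if y else neg_df).update(set(tokens))

    scores = {}
    for word in set(pos_df) | set(neg_df):
        p_pos = (pos_df[word] + smoothing) / (n_pos + 2 * smoothing)
        p_neg = (neg_df[word] + smoothing) / (n_neg + 2 * smoothing)
        scores[word] = math.log(p_pos * (1 - p_neg) / ((1 - p_pos) * p_neg))
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring words (ties broken alphabetically)."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

Words with a large positive score occur mostly in positive documents, so keeping the top-k words yields a feature subset biased toward evidence for the positive class. A classifier such as naive Bayes can then be trained on word vectors restricted to that subset.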

Citation (APA)

Mladenić, D. (1998). Feature subset selection in text-learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1398, pp. 95–100). Springer Verlag. https://doi.org/10.1007/bfb0026677
