This paper presents a family of rule learners whose rules are applied according to a partial-matching criterion based on different purity measures. The behavior of these rule learners is tested on a Text Categorisation problem. To illustrate the advantages of each learner, the MDL-based method of C4.5 is replaced by a pruning process whose performance relies on an estimate of the quality of the rules. Empirical results show that, in general, inducing partial-matching rules yields more compact rule sets without degrading performance measured in terms of microaveraged F1, one of the most common performance measures in Information Retrieval tasks. In particular, some purity measures produce significantly fewer rules than C4.5 while the performance measured with F1 is not degraded. © Springer-Verlag Berlin Heidelberg 2003.
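For readers unfamiliar with the evaluation measure named in the abstract, the following is a minimal sketch of microaveraged F1 under its standard definition: true positives, false positives, and false negatives are summed over all categories before precision and recall are combined. The function name `micro_f1` and the example counts are hypothetical illustrations and are not taken from the paper.

```python
from typing import Iterable, Tuple


def micro_f1(counts: Iterable[Tuple[int, int, int]]) -> float:
    """Microaveraged F1 from per-category (tp, fp, fn) counts.

    Counts are pooled across categories first, so frequent
    categories weigh more than rare ones.
    """
    tp = fp = fn = 0
    for c_tp, c_fp, c_fn in counts:
        tp += c_tp
        fp += c_fp
        fn += c_fn
    # F1 = 2PR / (P + R) simplifies to 2*tp / (2*tp + fp + fn).
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts for two categories: (tp, fp, fn).
example = [(8, 2, 1), (1, 0, 3)]
print(micro_f1(example))  # pooled tp=9, fp=2, fn=4 -> 18 / 24 = 0.75
```

Because the counts are pooled, a learner can score well on this measure mainly by handling the large categories correctly, which is worth keeping in mind when comparing rule sets of different sizes.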
CITATION STYLE
Ranilla, J., Díaz, I., & Fernández, J. (2003). Text categorisation using a partial-matching strategy. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2686, 262–269. https://doi.org/10.1007/3-540-44868-3_34