A semantic kernel to classify texts with very few training examples

ISSN: 03505596

Abstract

Advanced techniques for accessing information distributed on the Web often use automatic text categorization to filter out irrelevant data before activating specific search procedures. The drawback of this approach is that a large number of training documents is needed to train the target classifiers. One way to reduce this number is to use more effective document similarities based on prior knowledge. Unfortunately, previous work has shown that using such knowledge (e.g. WordNet) decreases retrieval accuracy. In this paper, we propose kernel functions that bring prior knowledge into learning algorithms for document classification. These kernels implement balanced and statistically coherent document similarities in a vector space by means of a term similarity based on the WordNet hierarchy. Cross-validation results show the benefit of the approach for Support Vector Machines when few training examples are available.
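To make the idea of a document similarity built from a term similarity more concrete, here is a minimal sketch in Python. It is not the paper's implementation: the vocabulary, the `TERM_SIM` matrix, and the function names are hypothetical, and the similarity values are invented for illustration rather than derived from the WordNet hierarchy. The sketch assumes the term-similarity matrix is symmetric and positive semi-definite, which is what makes the resulting function usable as a kernel.

```python
import numpy as np

# Hypothetical toy vocabulary and term-similarity matrix (illustrative values,
# NOT computed from WordNet). The matrix is symmetric and positive semi-definite.
VOCAB = ["car", "automobile", "banana"]
TERM_SIM = np.array([
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])


def doc_vector(tokens):
    """Map a token list to a term-frequency vector over VOCAB."""
    index = {term: i for i, term in enumerate(VOCAB)}
    vec = np.zeros(len(VOCAB))
    for tok in tokens:
        if tok in index:
            vec[index[tok]] += 1.0
    return vec


def semantic_kernel(x, y, sim=TERM_SIM):
    """Sum over all term pairs of weight_x * weight_y * term similarity."""
    return float(x @ sim @ y)


def normalized_kernel(x, y, sim=TERM_SIM):
    """Normalize by the self-similarities so the score does not grow with document length."""
    denom = np.sqrt(semantic_kernel(x, x, sim) * semantic_kernel(y, y, sim))
    return semantic_kernel(x, y, sim) / denom if denom > 0 else 0.0


d_car = doc_vector(["car"])
d_auto = doc_vector(["automobile"])
d_fruit = doc_vector(["banana"])

plain_overlap = float(d_car @ d_auto)           # bag-of-words dot product
semantic_score = normalized_kernel(d_car, d_auto)
unrelated_score = normalized_kernel(d_car, d_fruit)
```

In this toy setting the two documents share no words, so the plain dot product is zero, while the similarity-weighted score is not. That is the kind of generalization across related terms that could help when very few training examples are available.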

Basili, R., Cammisa, M., & Moschitti, A. (2006). A semantic kernel to classify texts with very few training examples. Informatica (Ljubljana), 30(2), 163–172.
