Supervised term weights for biomedical text classification: Improvements in nearest centroid computation

Abstract

Keeping biomedical literature databases accessible has led to the development of text classification systems that assist human indexers by recommending thematic categories for biomedical articles. These systems use machine learning methods to learn the association between document terms and predefined categories. The accuracy of a text classification method depends on the metric used to assign a weight to each term. Weighting metrics are classified as supervised or unsupervised according to whether they use prior information on the number of documents belonging to each category. In this paper, we propose two supervised weighting metrics (One-way Klosgen and Loevinger), both of which improve the quality of biomedical document classification. We also show that, with moment generating function centroids used as an alternative to the traditional arithmetic average centroids, a nearest centroid classifier with the Loevinger metric performs significantly better than SVM on a biomedical text classification task.
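To make the abstract's ingredients concrete, here is a minimal sketch of a nearest centroid classifier with a supervised term weight. It rests on several assumptions that the abstract does not confirm. The weight uses the usual association-rule form of Loevinger's measure, (P(c|t) - P(c)) / (1 - P(c)), estimated from document counts. Negative weights are clipped to zero. Centroids are plain arithmetic averages, not the moment generating function centroids the paper proposes. The function names and the toy corpus are hypothetical, and the paper's exact formulation may differ.

```python
import math
from collections import Counter, defaultdict


def loevinger_weights(docs, labels):
    """Return {(term, category): weight} from document counts.

    Assumed form: (P(c|t) - P(c)) / (1 - P(c)), clipped at zero.
    """
    n_docs = len(docs)
    cat_count = Counter(labels)   # number of documents per category
    term_count = Counter()        # number of documents containing each term
    joint_count = Counter()       # documents of a category containing a term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_count[term] += 1
            joint_count[(term, label)] += 1

    weights = {}
    for term, n_t in term_count.items():
        for cat, n_c in cat_count.items():
            p_c = n_c / n_docs
            if p_c >= 1.0:        # single-category corpus: measure undefined
                weights[(term, cat)] = 0.0
                continue
            p_c_given_t = joint_count[(term, cat)] / n_t
            weights[(term, cat)] = max(0.0, (p_c_given_t - p_c) / (1.0 - p_c))
    return weights


def weighted_vector(tokens, category, weights):
    """Term frequency times the category-specific weight (an assumption)."""
    tf = Counter(tokens)
    return {t: f * weights.get((t, category), 0.0) for t, f in tf.items()}


def cosine(u, v):
    dot = sum(x * v.get(t, 0.0) for t, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def train_centroids(docs, labels, weights):
    """Arithmetic average of the weighted vectors of each category."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter(labels)
    for tokens, label in zip(docs, labels):
        for t, x in weighted_vector(tokens, label, weights).items():
            sums[label][t] += x
    return {c: {t: s / counts[c] for t, s in sums[c].items()} for c in counts}


def predict(tokens, centroids, weights):
    """Pick the category whose centroid is most similar to the document."""
    return max(
        centroids,
        key=lambda c: cosine(weighted_vector(tokens, c, weights), centroids[c]),
    )


# Toy corpus, invented purely for illustration.
docs = [
    ["gene", "protein", "expression"],
    ["protein", "binding", "gene"],
    ["patient", "therapy", "trial"],
    ["trial", "patient", "dose"],
]
labels = ["genetics", "genetics", "clinical", "clinical"]

weights = loevinger_weights(docs, labels)
centroids = train_centroids(docs, labels, weights)
print(predict(["gene", "binding"], centroids, weights))
```

The weight is supervised in the abstract's sense because it depends on how many documents belong to each category. An unsupervised weight such as inverse document frequency would ignore the labels entirely.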

Cite

Haddoud, M., Mokhtari, A., Lecroq, T., & Abdeddaïm, S. (2016). Supervised term weights for biomedical text classification: Improvements in nearest centroid computation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9874 LNCS, pp. 98–113). Springer Verlag. https://doi.org/10.1007/978-3-319-44332-4_8
