Identifying rare classes with sparse training data

Abstract

Building models and learning patterns from a collection of data are essential tasks for decision making and dissemination of knowledge. One common way to extract knowledge is to build a classifier. However, when the training dataset is sparse, it is difficult to build an accurate classifier. This is especially true in the biological sciences, as biological data are hard to produce and error-prone. Through empirical results, this paper demonstrates the challenges of building an accurate classifier with a sparse biological training dataset. Our findings indicate the inadequacies of well-known classification techniques. Although certain clustering techniques, such as seeded k-Means, show some promise, there is still room for improvement. In addition, we propose a novel idea that could be used to produce a more balanced classifier when training data samples are very limited. © Springer-Verlag Berlin Heidelberg 2007.
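For readers unfamiliar with the seeded k-Means technique named in the abstract, the following is a minimal sketch of the general idea: a handful of labeled samples initialize one centroid per class, after which ordinary k-means iterations run over all points. It is not the authors' implementation. The function name `seeded_kmeans`, its parameters, and the synthetic two-class data are assumptions made purely for illustration.

```python
import numpy as np

def seeded_kmeans(X, seed_idx, seed_labels, n_iter=100):
    """Illustrative seeded k-means: labeled seeds only set the initial centroids."""
    X = np.asarray(X, dtype=float)
    seed_idx = np.asarray(seed_idx)
    seed_labels = np.asarray(seed_labels)
    classes = np.unique(seed_labels)
    # Initialize one centroid per class as the mean of that class's labeled seeds.
    centroids = np.stack(
        [X[seed_idx[seed_labels == c]].mean(axis=0) for c in classes]
    )
    for _ in range(n_iter):
        # Assign every point (seeds included) to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        new_centroids = np.stack([
            X[assign == k].mean(axis=0) if np.any(assign == k) else centroids[k]
            for k in range(len(classes))
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return classes[assign], centroids

# Synthetic data (an assumption for illustration): 50 common samples, 5 rare ones.
rng = np.random.default_rng(0)
common = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
rare = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(5, 2))
X = np.vstack([common, rare])

# Only three labeled seeds: two from the common class, one from the rare class.
labels, centroids = seeded_kmeans(X, seed_idx=[0, 1, 50], seed_labels=[0, 0, 1])
```

In this sketch the seed labels influence only the starting centroids, so a seed may later be reassigned to a different cluster. Variants that keep seed assignments fixed are a separate design choice.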

Cite (APA)

Zhang, M., Jiang, W., Clifton, C., & Prabhakar, S. (2007). Identifying rare classes with sparse training data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4653 LNCS, pp. 751–760). Springer Verlag. https://doi.org/10.1007/978-3-540-74469-6_73
