Active learning with sampling by uncertainty and density for word sense disambiguation and text classification

153Citations
Citations of this article
192Readers
Mendeley users who have this article in their library.
Get full text

Abstract

This paper addresses two issues of active learning. Firstly, to solve a problem of uncertainty sampling that it often fails by selecting outliers, this paper presents a new selective sampling technique, sampling by uncertainty and density (SUD), in which a k-Nearest-Neighbor-based density measure is adopted to determine whether an unlabeled example is an outlier. Secondly, a technique of sampling by clustering (SBC) is applied to build a representative initial training data set for active learning. Finally, we implement a new algorithm of active learning with SUD and SBC techniques. The experimental results from three real-world data sets show that our method outperforms competing methods, particularly at the early stages of active learning. © 2008. Licensed under the Creative Commons.

Cite

CITATION STYLE

APA

Zhu, J., Wang, H., Yao, T., & Tsou, B. K. (2008). Active learning with sampling by uncertainty and density for word sense disambiguation and text classification. In Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference (Vol. 1, pp. 1137–1144). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1599081.1599224

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free