A sequential algorithm for training text classifiers

David D. Lewis; William A. Gale

Conference Proceedings

Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 1994 (1994) 3-12

DOI: 10.1007/978-1-4471-2099-5_1

Abstract

The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was developed and tested on a newswire text categorization task. This method, which we call uncertainty sampling, reduced by as much as 500-fold the amount of training data that would have to be manually classified to achieve a given level of effectiveness.
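The abstract names uncertainty sampling but does not spell out the procedure. The following is a minimal sketch, assuming the common formulation for a binary classifier: in each round the learner requests labels for the unlabeled examples whose predicted class probability is closest to 0.5. The names `most_uncertain`, `uncertainty_sampling`, `train`, and `oracle` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

X = TypeVar("X")
Labeled = Tuple[X, bool]


def most_uncertain(probs: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k probabilities closest to 0.5."""
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]


def uncertainty_sampling(
    pool: Sequence[X],
    seed: Sequence[Labeled],
    train: Callable[[List[Labeled]], Callable[[X], float]],
    oracle: Callable[[X], bool],
    rounds: int,
    batch_size: int,
) -> List[Labeled]:
    """Grow a labeled set by repeatedly querying the least certain examples.

    Assumptions (hypothetical interfaces, not from the paper):
    - train(labeled) returns a function mapping an example to P(class = True).
    - oracle(x) stands in for the human who labels an example on request.
    """
    labeled = list(seed)
    unlabeled = list(pool)
    for _ in range(rounds):
        if not unlabeled:
            break
        predict = train(labeled)                 # refit on everything labeled so far
        probs = [predict(x) for x in unlabeled]  # score the remaining pool
        picked = most_uncertain(probs, batch_size)
        labeled.extend((unlabeled[i], oracle(unlabeled[i])) for i in picked)
        chosen = set(picked)
        unlabeled = [x for i, x in enumerate(unlabeled) if i not in chosen]
    return labeled
```

Only the examples passed to `oracle` need manual classification, which is where the savings in labeling effort described in the abstract would come from.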

Citation (APA)

Lewis, D. D., & Gale, W. A. (1994). A sequential algorithm for training text classifiers. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 1994 (pp. 3–12). Association for Computing Machinery, Inc. https://doi.org/10.1007/978-1-4471-2099-5_1
