Abstract
The common practice of testing a sequence of text classifiers learned on a growing training set, and stopping when a target value of estimated effectiveness is first met, introduces a sequential testing bias. In settings where the effectiveness of a text classifier must be certified (perhaps to a court of law), this bias may be unacceptable. The choice of when to stop training is made even more complex when, as is common, the annotation of training and test data must be paid for from a common budget: each new labeled training example is a lost test example. Drawing on ideas from statistical power analysis, we present a framework for joint minimization of training and test annotation that maintains the statistical validity of effectiveness estimates, and yields a natural definition of an optimal allocation of annotations to training and test data. We identify the development of allocation policies that can approximate this optimum as a central question for research. We then develop simulation-based power analysis methods for van Rijsbergen's F-measure, and incorporate them in four baseline allocation policies which we study empirically. In support of our studies, we develop a new analytic approximation of confidence intervals for the F-measure that is of independent interest.
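To make the quantities in the abstract concrete, the sketch below shows van Rijsbergen's F-measure computed from binary confusion counts, together with a simple bootstrap-style simulation of a confidence interval for F estimated from a labeled test sample. This is only an illustrative stand-in for the paper's simulation-based power analysis and its analytic interval approximation, which are not reproduced here; the function names and the resampling scheme are assumptions for the example.

```python
import numpy as np

def f_measure(tp, fp, fn, beta=1.0):
    """van Rijsbergen's F-measure from confusion counts:
    F_beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP)."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom > 0 else 0.0

def simulated_f_interval(y_true, y_pred, n_sims=10_000, alpha=0.05, seed=None):
    """Percentile confidence interval for F obtained by resampling the
    labeled test set (illustrative only; not the paper's method)."""
    rng = np.random.default_rng(seed)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n = len(y_true)
    stats = np.empty(n_sims)
    for i in range(n_sims):
        idx = rng.integers(0, n, size=n)          # resample test items with replacement
        t, p = y_true[idx], y_pred[idx]
        tp = int(np.sum((t == 1) & (p == 1)))
        fp = int(np.sum((t == 0) & (p == 1)))
        fn = int(np.sum((t == 1) & (p == 0)))
        stats[i] = f_measure(tp, fp, fn)
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

# Example: point estimate and simulated 95% interval on a small test sample.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 1])
lo, hi = simulated_f_interval(y_true, y_pred, seed=0)
print(f"95% interval for F1: [{lo:.3f}, {hi:.3f}]")
```

The width of such an interval shrinks as more annotations are spent on test data, which is the tension the paper's allocation policies trade off against spending those annotations on training.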
Citation
Bagdouri, M., Webber, W., Lewis, D. D., & Oard, D. W. (2013). Towards minimizing the annotation cost of certified text classification. In International Conference on Information and Knowledge Management, Proceedings (pp. 989–998). https://doi.org/10.1145/2505515.2505708