Model selection (choosing, for example, the feature set and the regularization strength) is crucial for building high-accuracy NLP models. In supervised learning, we can estimate the accuracy of a model on a held-out subset of the labeled data and choose the model with the highest accuracy. In contrast, here we focus on type-supervised learning, which uses constraints over the possible labels of word types for supervision, and in which labeled data is either unavailable or very limited. For the setting where no labeled data is available, we conduct a comparative study of previously proposed model selection criteria and one novel criterion on type-supervised POS tagging in nine languages. For the setting where a small labeled set is available, we show that the set should be used for semi-supervised learning rather than for model selection alone: using it only for model selection reduces the error by less than 5%, whereas using it for semi-supervised learning reduces the error by 44%.
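The supervised baseline described above (estimating each candidate model's accuracy on held-out labeled data and keeping the best) can be sketched as follows. This is a minimal illustration, not the paper's method: the synthetic dataset, the classifier, and the candidate grid of regularization strengths are all illustrative assumptions.

```python
# Minimal sketch of supervised model selection by held-out accuracy.
# The data, classifier, and candidate grid are illustrative assumptions,
# not the paper's actual models or settings.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic labeled data, split into a training and a held-out dev set.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_dev, y_train, y_dev = train_test_split(
    X, y, test_size=0.5, random_state=0)

best_C, best_acc = None, -1.0
for C in [0.01, 0.1, 1.0, 10.0]:  # candidate regularization strengths
    model = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    acc = model.score(X_dev, y_dev)  # held-out accuracy
    if acc > best_acc:
        best_C, best_acc = C, acc

print(best_C, round(best_acc, 3))
```

The type-supervised setting studied in the paper is precisely the case where such a held-out labeled dev set is unavailable, so the selection criterion must be computed without gold labels.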
Toutanova, K., Ammar, W., Choudhury, P., & Poon, H. (2015). Model selection for type-supervised learning with application to POS tagging. In Proceedings of the 19th Conference on Computational Natural Language Learning (CoNLL 2015) (pp. 332–337). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/k15-1036