We propose a fast and scalable method for semi-supervised learning of sequence models, based on anchor words and moment matching. Our method can handle hidden Markov models with feature-based log-linear emissions. Unlike other semi-supervised methods, it requires no decoding passes over the unlabeled data and no graph construction: a single pass suffices to collect moment statistics. The model parameters are estimated by solving a small quadratic program for each feature. Experiments on part-of-speech (POS) tagging for Twitter and for a low-resource language (Malagasy) show that our method can learn from very few annotated sentences.
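The abstract does not spell out the per-feature quadratic program, so the following is only a minimal sketch of one plausible form, under stated assumptions: each hidden state is assumed to have a known context-moment vector (for example, estimated from its anchor words), and a feature's observed moment vector is approximated as a convex combination of those state vectors. The names `project_to_simplex`, `solve_simplex_qp`, `anchor_moments`, and `feature_moment` are hypothetical, and projected gradient descent stands in for whatever solver the authors actually use.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        if u_i - t > 0:
            theta = t  # keep the threshold from the last index that passes
    return [max(x - theta, 0.0) for x in v]


def solve_simplex_qp(anchor_moments, feature_moment, steps=2000):
    """Minimize ||sum_k gamma[k] * anchor_moments[k] - feature_moment||^2
    subject to gamma lying on the probability simplex.

    anchor_moments: one moment vector per hidden state (assumed given).
    feature_moment: the observed moment vector for a single feature.
    """
    num_states = len(anchor_moments)
    dim = len(feature_moment)
    # The squared Frobenius norm bounds the largest eigenvalue of R R^T,
    # so 1 / (2 * frob) is a safe step size (assumes a nonzero matrix).
    frob = sum(a * a for row in anchor_moments for a in row)
    step = 1.0 / (2.0 * frob)
    gamma = [1.0 / num_states] * num_states  # start from the uniform mixture
    for _ in range(steps):
        residual = [
            sum(gamma[k] * anchor_moments[k][j] for k in range(num_states))
            - feature_moment[j]
            for j in range(dim)
        ]
        grad = [
            2.0 * sum(anchor_moments[k][j] * residual[j] for j in range(dim))
            for k in range(num_states)
        ]
        gamma = project_to_simplex([g - step * d for g, d in zip(gamma, grad)])
    return gamma


# Toy usage: two states, three context dimensions (illustrative numbers only).
anchor_moments = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
feature_moment = [0.25, 0.275, 0.475]  # equals 0.25 * state 0 + 0.75 * state 1
gamma = solve_simplex_qp(anchor_moments, feature_moment)
```

Because each problem has only as many variables as there are hidden states, one such solve per feature stays cheap even with a large feature set, which is consistent with the scalability claim in the abstract.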
Marinho, Z., Martins, A. F. T., Cohen, S. B., & Smith, N. A. (2016). Semi-supervised learning of sequence models with the method of moments. In EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 287–296). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d16-1028