Supervised and semi-supervised sense disambiguation methods mis-tag instances of a target word when the senses of those instances are not defined in the sense inventory or when the training data contains no tagged instances for those senses. We use a model order identification method to avoid misclassifying instances with undefined senses by discovering new senses from mixed data (tagged and untagged corpora). The algorithm seeks a natural partition of the mixed data by maximizing a stability criterion, defined on the classification result of an extended label propagation algorithm, over all possible values of the number of senses (also called the sense number or model order). Experimental results on SENSEVAL-3 data indicate that, when the tagged data is incomplete, it outperforms SVM as well as a one-class partially supervised classification algorithm and a clustering-based model order identification algorithm. © 2006 Association for Computational Linguistics.
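For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of standard graph-based label propagation with clamped seeds. It is not the authors' extended algorithm: the abstract does not specify the extension or the stability criterion, so neither is reproduced here. The function name `label_propagation`, the one-dimensional toy features, the Gaussian affinity, and the parameters `sigma` and `n_iter` are all assumptions made for illustration.

```python
import math


def label_propagation(points, seeds, n_classes, sigma=1.0, n_iter=200):
    """Standard label propagation on a fully connected graph (illustrative only).

    points    : list of floats (1-D features, a simplifying assumption)
    seeds     : dict mapping the index of a tagged point to its class
    n_classes : number of classes (the sense number in the abstract's terms)
    Returns a list with one predicted class per point.
    """
    n = len(points)
    # Gaussian affinities between all pairs of points, without self-loops.
    w = [[0.0 if i == j
          else math.exp(-((points[i] - points[j]) ** 2) / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    # Row-normalise the affinities into transition probabilities.
    t = [[wij / sum(row) for wij in row] for row in w]

    # Tagged points start with a one-hot distribution, untagged with zeros.
    y = [[0.0] * n_classes for _ in range(n)]
    for i, c in seeds.items():
        y[i][c] = 1.0

    for _ in range(n_iter):
        # Each point takes the weighted average of its neighbours' labels.
        new = [[sum(t[i][j] * y[j][c] for j in range(n))
                for c in range(n_classes)] for i in range(n)]
        # Clamp the tagged points back to their known labels.
        for i, c in seeds.items():
            new[i] = [1.0 if k == c else 0.0 for k in range(n_classes)]
        y = new

    # Assign each point the class with the highest propagated score.
    return [max(range(n_classes), key=lambda c: y[i][c]) for i in range(n)]


# Toy data: two well-separated groups, with one tagged point in each.
toy_points = [0.0, 0.2, 0.4, 5.0, 5.2, 5.4]
toy_seeds = {0: 0, 3: 1}
print(label_propagation(toy_points, toy_seeds, n_classes=2))
```

A model order search in the spirit of the abstract would, presumably, run a propagation step like this for each candidate sense number and keep the value that maximizes a stability score. That outer loop is left out because the criterion itself is not given here.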
Niu, Z. Y., Ji, D. H., & Tan, C. L. (2006). Partially supervised sense disambiguation by learning sense number from tagged and untagged corpora. In COLING/ACL 2006 - EMNLP 2006: 2006 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 415–422). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1610075.1610134