In this paper, we introduce a new semi-supervised learning model for word sense disambiguation (WSD) based on Kernel Principal Component Analysis (KPCA). Experiments show that it further improves accuracy over supervised KPCA models, which have themselves achieved WSD accuracy superior to the best published individual models. Although empirical results on Senseval-2 data show that supervised KPCA models are significantly more accurate than the state of the art achieved by either naïve Bayes or maximum entropy models, we identify specific sparse-data conditions under which supervised KPCA models deteriorate to essentially a most-frequent-sense predictor. We discuss the potential of KPCA to leverage unannotated data through partially unsupervised training to address this problem, leading to a composite model that combines the supervised and semi-supervised models.
CITATION STYLE
Su, W., Carpuat, M., & Wu, D. (2004). Semi-supervised training of a kernel PCA-based model for word sense disambiguation. In COLING 2004 - Proceedings of the 20th International Conference on Computational Linguistics. Association for Computational Linguistics (ACL). https://doi.org/10.3115/1220355.1220545