Abstract
This paper presents an extension of non-negative matrix factorization (NMF), a dimensionality reduction algorithm, that combines 'bag of words' data with syntactic data in order to find semantic dimensions according to which both words and syntactic relations can be classified. The use of three-way data makes it possible to determine which dimension(s) are responsible for a certain sense of a word and to adapt the corresponding feature vector accordingly, 'subtracting' one sense to discover another. The intuition is that the syntactic features of the syntax-based approach can be disambiguated by the semantic dimensions found by the bag-of-words approach. The approach is embedded in clustering algorithms to make it fully automatic. It is applied to Dutch and evaluated against EuroWordNet. © 2008. Licensed under the Creative Commons.
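To make the idea of semantic dimensions concrete, the following is a minimal sketch of standard non-negative matrix factorization with multiplicative updates, followed by an illustrative 'subtraction' of a word's dominant dimension. It is not the paper's three-way extension: the toy matrix, the function name `nmf`, the choice of two dimensions, and the zeroing-out step are all assumptions made for illustration only.

```python
import numpy as np

def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V (n x m) into W (n x k) and H (k x m)
    using multiplicative updates for the squared-error objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Element-wise updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical word-by-context count matrix (rows = words, columns = contexts).
V = np.array([
    [4, 3, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 5, 2],
    [2, 2, 3, 3],  # a word that occurs in both context groups
], dtype=float)

W, H = nmf(V, k=2)

# Illustrative 'subtraction' (an assumption, not the paper's procedure):
# zero out the strongest dimension of the mixed word and rebuild its
# feature vector from the remaining dimension.
word = 3
dominant = int(np.argmax(W[word]))
w_rest = W[word].copy()
w_rest[dominant] = 0.0
residual_vector = w_rest @ H
```

Here `W[word]` gives the word's loading on each dimension, and `residual_vector` is what remains of its context profile once the dominant dimension is removed. In this toy setting, that residue plays the role of the secondary sense.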
Cite
Van De Cruys, T. (2008). Using three way data for word sense discrimination. In Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference (Vol. 1, pp. 929–936). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1599081.1599198