A word-to-word similarity function automatically extracted from a corpus of texts can be a very helpful tool in the automatic extraction of lexical semantic relations. Many approaches exist for English, but only a few for inflective languages with almost free word order. This paper proposes a method for constructing a similarity function for Polish nouns. The method uses only simple language processing tools (e.g. it does not need a parser). Its core is a matrix of noun-adjective co-occurrences, built by applying morpho-syntactic constraints that test agreement between an adjective and a noun. Several methods of transforming the matrix and calculating the similarity function are presented. The achieved accuracy of 81.15% in the WordNet-based Synonymy Test (for 4,611 Polish nouns, using the current version of Polish WordNet) seems to be comparable with the best results reported for English (e.g. 75.8% [5]). © Springer-Verlag Berlin Heidelberg 2007.
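The pipeline summarized in the abstract (noun-adjective co-occurrence matrix, then a similarity function over its rows) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `pairs` data, the helper names `build_matrix`, `cosine`, and `most_similar`, and the choice of plain cosine similarity on raw counts. The abstract does not state which matrix transformations or similarity measures the authors use, and the agreement check itself is not implemented; the pairs are simply assumed to have passed it.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy input: (noun, adjective) pairs assumed to have already
# passed a morpho-syntactic agreement check (not implemented here).
pairs = [
    ("pies", "duży"), ("pies", "wierny"), ("pies", "głodny"),
    ("kot", "duży"), ("kot", "głodny"), ("kot", "czarny"),
    ("samochód", "szybki"), ("samochód", "czarny"), ("samochód", "nowy"),
]


def build_matrix(pairs):
    """Build a sparse noun x adjective co-occurrence matrix as nested counters."""
    rows = defaultdict(Counter)
    for noun, adjective in pairs:
        rows[noun][adjective] += 1
    return rows


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(rows, noun, candidates):
    """Pick the candidate whose row is closest to the row of `noun`."""
    empty = Counter()
    return max(
        candidates,
        key=lambda c: cosine(rows.get(noun, empty), rows.get(c, empty)),
    )


rows = build_matrix(pairs)
best = most_similar(rows, "pies", ["kot", "samochód"])
```

Choosing the closest candidate from a small answer set, as `most_similar` does, loosely mirrors the shape of a synonymy test, although the exact protocol of the WordNet-based Synonymy Test is not described in the abstract.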
Piasecki, M., & Broda, B. (2007). Semantic similarity measure of Polish nouns based on linguistic features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4439 LNCS, pp. 381–390). Springer-Verlag. https://doi.org/10.1007/978-3-540-72035-5_29