The paper deals with the linguistic problem of fully automatic grouping of semantically related words. We discuss measures of semantic relatedness of basic word forms and describe the treatment of collocations. Next, we present a procedure for hierarchical clustering of a very large number of semantically related words and give examples of the resulting partitioning of the data in the form of a dendrogram. Finally, we show a form of output presentation that facilitates inspection of the resulting word clusters.
Smrž, P., & Rychlý, P. (2001). Finding semantically related words in large corpora. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2166, pp. 108–115). Springer Verlag. https://doi.org/10.1007/3-540-44805-5_14