This paper introduces a measure of corpus homogeneity that indicates the amount of topical dispersion in a corpus. The measure is based on the density of neighborhoods in semantic word spaces. We evaluate the measure by comparing the results for five different corpora. Our initial results indicate that the proposed density measure can indeed identify differences in topical dispersion. © Springer-Verlag Berlin Heidelberg 2005.
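As a rough illustration of what a neighborhood-density measure in a word space could look like, the sketch below computes the average number of neighbors per word, where a neighbor is any other word whose vector has a cosine similarity at or above a threshold. This is not the authors' implementation: the abstract does not say how neighborhoods are defined, so the function names, the `threshold` parameter, its default value of 0.8, and the toy vectors are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def neighborhood_density(vectors, threshold=0.8):
    """Average number of neighbors per word.

    `vectors` maps each word to its vector in the word space.
    A neighbor is any other word with cosine similarity >= threshold.
    Both the threshold and this definition are illustrative assumptions.
    """
    words = list(vectors)
    if not words:
        return 0.0
    total = 0
    for w in words:
        total += sum(
            1
            for other in words
            if other != w and cosine(vectors[w], vectors[other]) >= threshold
        )
    return total / len(words)


# Toy word spaces (hypothetical data): one tightly clustered, one spread out.
tight = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.95, 0.05]}
spread = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [-1.0, 0.0]}

print(neighborhood_density(tight))
print(neighborhood_density(spread))
```

Computing such a score for word spaces built from different corpora would give one number per corpus to compare, which is the kind of comparison the abstract describes for its five corpora.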
Sahlgren, M., & Karlgren, J. (2005). Counting lumps in word space: Density as a measure of corpus homogeneity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3772 LNCS, pp. 151–154). https://doi.org/10.1007/11575832_16