Counting lumps in word space: Density as a measure of corpus homogeneity

Magnus Sahlgren; Jussi Karlgren

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2005) 3772 LNCS 151-154

DOI: 10.1007/11575832_16

5 Citations

10 Readers

Abstract

This paper introduces a measure of corpus homogeneity that indicates the amount of topical dispersion in a corpus. The measure is based on the density of neighborhoods in semantic word spaces. We evaluate the measure by comparing the results for five different corpora. Our initial results indicate that the proposed density measure can indeed identify differences in topical dispersion. © Springer-Verlag Berlin Heidelberg 2005.

Cite

Sahlgren, M., & Karlgren, J. (2005). Counting lumps in word space: Density as a measure of corpus homogeneity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3772 LNCS, pp. 151–154). https://doi.org/10.1007/11575832_16
