The classic keyword approach performs rather poorly at recognizing words that are typical of a specific language variety. We show how keyword analysis can be complemented with a word space model built from two corpora: one representative of the language variety under investigation, and a reference corpus. This combined approach recognizes the markers of a language variety as words that not only have a significantly higher frequency than in the reference corpus, but also a different distribution. Word space models moreover make it possible to automatically discover, in the reference corpus, the lexical alternative to a specific marker.
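The two ingredients named in the abstract, a frequency-based keyword test and a distributional comparison, can be illustrated with a minimal sketch. The abstract does not say which keyness statistic or similarity measure the authors use, so this sketch assumes Dunning's log-likelihood ratio for keyness and cosine similarity over raw co-occurrence counts. The function names and the window size are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness (G2), assumed here as the keyword statistic.

    a: frequency of the word in the variety corpus
    b: frequency of the word in the reference corpus
    c: total tokens in the variety corpus
    d: total tokens in the reference corpus
    """
    # Expected frequencies if the word were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def context_vector(tokens, target, window=2):
    """Count the words occurring within `window` positions of `target`."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                vec[tokens[j]] += 1
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Under these assumptions, a candidate marker would be a word with a high `log_likelihood` score whose `context_vector` in the variety corpus has a low `cosine` with its context vector in the reference corpus. The reference-corpus word whose context vector is closest to the marker's variety-corpus vector would then be a candidate lexical alternative.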
Peirsman, Y., & Speelman, D. (2009). Word Space Models of Lexical Variation. In Proceedings of the EACL 2009 Workshop on GEMS: GEometrical Models of Natural Language Semantics, GEMS 2009 (pp. 9–16). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1705415.1705417