Abstract
In this paper, I introduce methodologies to tap corpora for exploring aggregate linguistic distances between dialects or varieties as a function of properties of geographic space. The paper describes the different steps necessary to obtain an appropriate corpus-based dataset (a so-called 'distance matrix'), and subsequently discusses several cartographic visualisation techniques (network maps, continuum maps and cluster maps) to project aggregate linguistic relationships to geography. In addition, the paper sketches some statistical methods to quantify these relationships. By way of example, a case study draws on the Freiburg Corpus of English Dialects, a major dialect corpus in which more than thirty traditional dialects of English from all over Great Britain are sampled. With a focus on regional variation in morphosyntax and on the basis of text frequencies of several dozen features, the study probes joint linguistic variability between the dialects sampled in the corpus. © Edinburgh University Press.
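To make the notion of a 'distance matrix' concrete, the following is a minimal sketch of how pairwise distances between dialects might be derived from feature frequencies. It is an illustration under stated assumptions, not the paper's actual procedure: the dialect labels and frequency values are invented toy data, and Euclidean distance is assumed as the measure because the abstract does not name one.

```python
import math

# Hypothetical toy data (invented for illustration): each dialect is
# described by the text frequencies of three features, e.g. per 10,000 words.
frequencies = {
    "dialect_a": [12.0, 3.0, 40.0],
    "dialect_b": [10.0, 5.0, 38.0],
    "dialect_c": [2.0, 20.0, 15.0],
}


def euclidean(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(freqs):
    """Return sorted dialect names and a symmetric matrix of pairwise distances.

    Assumption: Euclidean distance stands in for whichever measure
    a real study would choose.
    """
    names = sorted(freqs)
    matrix = [[euclidean(freqs[r], freqs[c]) for c in names] for r in names]
    return names, matrix


names, matrix = distance_matrix(frequencies)
for name, row in zip(names, matrix):
    print(name, [round(d, 2) for d in row])
```

A matrix of this kind is symmetric with zeros on the diagonal, and could serve as input to the clustering or mapping steps the abstract mentions.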
Szmrecsanyi, B. (2011). Corpus-based dialectometry: A methodological sketch. Corpora, 6(1), 45–76. https://doi.org/10.3366/cor.2011.0004