CANELC: Constructing an e-language Corpus

  • Knight D
  • Adolphs S
  • Carter R


This paper reports on the construction of the Cambridge and Nottingham e-language Corpus (CANELC). CANELC is a one-million word corpus of digital communication in English, taken from online discussion boards, blogs, tweets, e-mails and Short Message Services (SMS). The paper outlines the approaches used when planning the corpus: obtaining consent, collecting the data and compi...

Note: This corpus has been built as part of a collaborative project between the University of Nottingham and Cambridge University Press with whom sole copyright of the annotated corpus resides. CANELC comprises one-million words of digital English taken from SMS messages, blogs, Tweets, discussion board content and private/business e-mails. Plans to extend the corpus are under discussion. The legal dimension to corpus ‘ownership’ of some forms of unannotated data is a complex one and is under constant review. At present, the annotated corpus is only available to authors and researchers working for CUP and is not more generally available.

Author-supplied keywords

  • Blogs
  • Corpus linguistics
  • Discussion boards
  • E-language
  • SMS
  • Tweets
