Too big or not too big: Establishing the minimum size for a legal ad hoc corpus

Abstract

A corpus can be described as "[a] collection of texts assumed to be representative of a given language, dialect, or other subset of a language, to be used for linguistic analysis" (Francis 1982). However, the concept of representativeness remains surprisingly imprecise, considering that it is accepted as the central characteristic distinguishing a corpus from any other kind of collection (Seghiri 2008). In fact, there is no general agreement as to what the size of a corpus should ideally be. In practice, "the size of a corpus tends to reflect the ease or difficulty of acquiring the material" (Giouli/Piperidis 2002). For this reason, this paper addresses that key question. We focus on the complex notion of representativeness and ideal size for ad hoc corpora, from both a theoretical and an applied perspective, and we describe a computer application named ReCor, which is used to verify whether a compiled sample of legal contracts can be considered representative from a quantitative point of view.

Cite

Seghiri, M. (2015). Too big or not too big: Establishing the minimum size for a legal ad hoc corpus. Hermes (Denmark), 53, 85–98. https://doi.org/10.7146/hjlcb.v27i53.20981
