This chapter provides an overview of the datasets, corpora and other language resources (LRs) available in the European Language Grid (ELG) and of how they were made available. We look at the procedures and steps followed to complete the full resource ingestion cycle, from repository and LR identification to metadata description and ingestion, and we explain the approaches, priorities and methodology. The chapter also outlines the repositories that have been integrated into ELG, discussing the different procedures followed (metadata conversion, extraction and completion, as well as harvesting) and the reasons behind these choices. Furthermore, the ELG catalogue content is described, with details on key elements and features as well as accomplishments. The last two sections are devoted to the crucial legal issues behind such a complex platform and to its data management plan.
Arranz, V., Choukri, K., Mapelli, V., Rigault, M., Labropoulou, P., Deligiannis, M., … Piperidis, S. (2023). Datasets, Corpora and other Language Resources. In Cognitive Technologies (pp. 151–169). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-17258-8_8