Datasets, Corpora and other Language Resources

Abstract

This chapter provides an overview of what is available in ELG in terms of datasets, corpora and other language resources (LRs) and how this has been achieved. We look at the procedures and steps that have been followed to complete the full resource ingestion cycle, which goes from repository and LR identification to metadata description and ingestion. We explain the approaches, priorities and methodology. The chapter also outlines the repositories that have been integrated into ELG, discussing the different procedures followed (metadata conversion, extraction, and completion, as well as harvesting) and the reasons behind these choices. Furthermore, the ELG catalogue content is described, with details on key elements and features as well as accomplishments. The last two sections are devoted to the crucial legal issues behind such a complex platform and its data management plan, respectively.

Cite

Arranz, V., Choukri, K., Mapelli, V., Rigault, M., Labropoulou, P., Deligiannis, M., … Piperidis, S. (2023). Datasets, Corpora and other Language Resources. In Cognitive Technologies (pp. 151–169). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-17258-8_8
