Error annotation of the Arabic learner corpus: A new error tagset

Abdullah Alfaifi; Eric Atwell; Ghazi Abuhakema

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 8105 LNAI 14-22

DOI: 10.1007/978-3-642-40722-2_2

Abstract

This paper introduces a new two-level error tagset, AALETA (Alfaifi Atwell Leeds Error Tagset for Arabic), to be used for annotating the Arabic Learner Corpora (ALC). The new tagset includes six broad classes, subdivided into 37 more specific error types or subcategories. It is easily understood by Arabic corpus error annotators. AALETA is based on an existing error tagset for Arabic corpora, ARIDA, created by Abuhakema et al. [1], and on a number of other error-analysis studies. It was used to annotate texts of the Arabic Learner Corpus [2]. The paper presents the tagset's broad classes and their types or subcategories, along with an example of annotation. The understandability of AALETA was measured against that of ARIDA, and the preliminary results showed that AALETA achieved a slightly higher score. Annotators reported that they preferred using AALETA over ARIDA. © 2013 Springer-Verlag.

Cite (APA)

Alfaifi, A., Atwell, E., & Abuhakema, G. (2013). Error annotation of the Arabic learner corpus: A new error tagset. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8105 LNAI, pp. 14–22). https://doi.org/10.1007/978-3-642-40722-2_2
