In this paper, we describe the development of TALAA-AFAQ, a corpus of Arabic factoid question-answer pairs designed for use in the training modules of an Arabic Question Answering System (AQAS). The corpus is built in five steps, during which we extract syntactic and semantic features along with other information. In addition, we extract from the web a set of answer patterns for each question. The corpus contains 2002 question-answer pairs, of which 618 have answer patterns. It is divided into four main classes and 34 finer categories. All answer patterns and features have been validated by experts in Arabic. To the best of our knowledge, this is the first corpus of Arabic factoid question-answer pairs built specifically to support the development of AQASs.
Citation
Aouichat, A., & Guessoum, A. (2017). Building TALAA-AFAQ, a corpus of Arabic factoid question-answers for a question answering system. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10260 LNCS, pp. 380–386). Springer Verlag. https://doi.org/10.1007/978-3-319-59569-6_46