Question Answering is a long-standing field in computer science that aims to build systems able to answer questions expressed in natural language. However, building Question Answering systems for Italian that can extract answers from a corpus pertaining to a closed domain is still an open research problem. Indeed, two tasks are particularly difficult: extracting clues from a question to generate a query for the information retrieval engine, and determining the likelihood that a candidate answer is correct. To address these issues, the paper presents a Question Answering pipeline for Italian based on a corpus of documents pertaining to a closed domain. In particular, the pipeline provides functionality for: (i) analyzing natural language questions in Italian by using lexical features; (ii) handling both factoid and description answer types and, depending on the type, filtering contextual stop words from questions; (iii) scoring and selecting candidate answers with respect to their type in order to determine the best one. The proposed solution was evaluated using standard metrics, with promising results.
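The three steps listed in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the cue phrases, the stop-word lists, the function names, and the overlap-based scoring are all hypothetical placeholders chosen only to show how an answer type might drive both stop-word filtering and candidate scoring.

```python
import re

# All word lists below are illustrative assumptions, not taken from the paper.
DESCRIPTION_CUES = ("che cos", "cosa significa", "perché", "come ")
GENERIC_STOP_WORDS = {"il", "lo", "la", "i", "gli", "le", "di", "a", "da",
                      "in", "che", "è", "e", "un", "una", "ha", "del", "della"}
CONTEXTUAL_STOP_WORDS = {
    "factoid": {"chi", "quando", "dove", "quale", "quanto"},
    "description": {"cosa", "cos", "significa", "perché", "come"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def classify_question(question):
    """Step (i)/(ii): guess the answer type from simple lexical cues."""
    q = question.lower()
    if any(cue in q for cue in DESCRIPTION_CUES):
        return "description"
    return "factoid"


def build_query(question):
    """Step (ii): drop generic and type-dependent stop words."""
    answer_type = classify_question(question)
    stop = GENERIC_STOP_WORDS | CONTEXTUAL_STOP_WORDS[answer_type]
    terms = [t for t in tokenize(question) if t not in stop]
    return answer_type, terms


def score_candidate(candidate, terms, answer_type):
    """Step (iii): toy score = fraction of query terms found in the candidate.

    For factoid questions, shorter candidates are mildly preferred
    (an arbitrary heuristic used here only for illustration).
    """
    tokens = tokenize(candidate)
    if not terms:
        return 0.0
    score = sum(1 for t in set(terms) if t in tokens) / len(terms)
    if answer_type == "factoid":
        score /= 1 + len(tokens) / 20
    return score


def best_answer(question, candidates):
    """Return the highest-scoring candidate for the question."""
    answer_type, terms = build_query(question)
    return max(candidates, key=lambda c: score_candidate(c, terms, answer_type))
```

In this sketch, a "who" question is classified as factoid and its interrogative word is filtered out before retrieval, while a "what is" question is classified as description and loses a different set of words. A real pipeline would replace each placeholder with proper linguistic analysis and a retrieval engine.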
Damiano, E., Spinelli, R., Esposito, M., & de Pietro, G. (2018). An effective corpus-based question answering pipeline for Italian. In Smart Innovation, Systems and Technologies (Vol. 76, pp. 80–90). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-319-59480-4_9