Towards a more Robust Evaluation for Conversational Question Answering

Abstract

With the explosion of chatbot applications, Conversational Question Answering (CQA) has generated a lot of interest in recent years. Among the proposed approaches, reading comprehension models that take advantage of the conversation history (the previous question-answer pairs) seem to answer better than those that consider only the current question. Nevertheless, we note that the CQA evaluation protocol has a major limitation: at each turn of the conversation, models are allowed to access the ground-truth answers of the previous turns. Not only does this severely limit their applicability to fully autonomous chatbots, it also leads to unsuspected biases in their behavior. In this paper, we highlight this effect and propose new tools for evaluation and training to guard against these issues. Our new results reinforce the methods of the current state of the art.
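The limitation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or tooling: the `evaluate` function, the `toy_model`, the `use_gold_history` flag, and the exact-match scoring are all hypothetical choices made here for illustration. The sketch only contrasts feeding a model the ground-truth answers of previous turns with feeding it its own earlier predictions.

```python
from typing import Callable, List, Tuple

# One conversation: a list of (question, gold_answer) turns.
Turn = Tuple[str, str]
# Hypothetical model interface: (history, current question) -> predicted answer.
Model = Callable[[List[Turn], str], str]


def evaluate(model: Model, conversation: List[Turn], use_gold_history: bool) -> float:
    """Return exact-match accuracy over the turns of one conversation.

    use_gold_history=True  : history holds ground-truth answers of previous turns.
    use_gold_history=False : history holds the model's own previous predictions.
    Exact match is an assumption made for simplicity, not the paper's metric.
    """
    history: List[Turn] = []
    correct = 0
    for question, gold in conversation:
        predicted = model(history, question)
        correct += int(predicted == gold)
        # The only difference between the two protocols is what enters the history.
        history.append((question, gold if use_gold_history else predicted))
    return correct / len(conversation)


def toy_model(history: List[Turn], question: str) -> str:
    """Toy model that leans on the history: it repeats the last answer it sees."""
    if not history:
        return "unknown"
    return history[-1][1]


# Toy conversation in which every gold answer is the same entity.
conversation = [
    ("Where was she born?", "Paris"),
    ("Where did she study?", "Paris"),
    ("Where did she work?", "Paris"),
]

gold_score = evaluate(toy_model, conversation, use_gold_history=True)
predicted_score = evaluate(toy_model, conversation, use_gold_history=False)
print(gold_score, predicted_score)
```

With gold history, the toy model recovers after its first mistake and gets two of the three turns right. With its own predictions as history, the first error propagates and it gets none right, which is the kind of gap an evaluation based on ground-truth history can hide.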

Cite

Siblini, W., Sayil, B., & Kessaci, Y. (2021). Towards a more Robust Evaluation for Conversational Question Answering. In ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference (Vol. 2, pp. 1028–1034). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.acl-short.130
