Subjective annotation and evaluation of three different chatbots WOCHAT: Shared task report

Citations: 4 · Readers: 2
Abstract

This paper evaluates the performance of three chatbots that have been made publicly available online: IRIS, TickTock, and Joker. All three are retrieval-based, chat-oriented dialogue systems designed to engage users in all types of conversations for as long as possible. They employ different approaches to provide relevant and valid responses, and they continually apply conversational strategies and machine learning to improve their own performance. The analysis of annotations of more than 2000 responses from the three chatbots allowed us to confirm the robustness, scalability, and usability of the systems, to detect a few areas in which response accuracy was lacking, and to propose future work to further improve the three systems and the annotation scheme.

Citation (APA)

Kong-Vega, N., Shen, M., Wang, M., & D’Haro, L. F. (2019). Subjective annotation and evaluation of three different chatbots WOCHAT: Shared task report. In Lecture Notes in Electrical Engineering (Vol. 579, pp. 371–378). Springer. https://doi.org/10.1007/978-981-13-9443-0_32
