Evaluation of Statistical POMDP-Based Dialogue Systems in Noisy Environments

  • Young S
  • Breslin C
  • Gašić M
  • et al.

Abstract

Compared to conventional hand-crafted rule-based dialogue management systems, statistical POMDP-based dialogue managers offer the promise of increased robustness, reduced development and maintenance costs, and scalability to large open domains. As a consequence, there has been considerable research activity in approaches to statistical spoken dialogue systems over recent years. However, building and deploying a real-time spoken dialogue system is expensive, and even when operational, it is hard to recruit sufficient users to get statistically significant results. Instead, researchers have tended to evaluate using user simulators or by reprocessing existing corpora, both of which are unconvincing predictors of actual real-world performance. This paper describes the deployment of a real-world restaurant information system and its evaluation in a motor car by subjects recruited locally and by remote users recruited through Amazon Mechanical Turk. The paper explores three key questions: are statistical dialogue systems more robust than conventional hand-crafted systems; how does the performance of a system evaluated on a user simulator compare to performance with real users; and can performance of a system tested over the telephone network be used to predict performance in more hostile environments such as a motor car? The results show that the statistical approach is indeed more robust, but results from a simulator significantly over-estimate performance in both absolute and relative terms. Finally, by matching word error rates (WER), performance results obtained over the telephone can provide useful predictors of performance in noisier environments such as the motor car, but again they tend to over-estimate performance.

Cite

Young, S., Breslin, C., Gašić, M., Henderson, M., Kim, D., Szummer, M., … Hancock, E. T. (2016). Evaluation of Statistical POMDP-Based Dialogue Systems in Noisy Environments (pp. 3–14). https://doi.org/10.1007/978-3-319-21834-2_1
