We propose an efficient online learning method for dialogue management based on the Bayes risk criterion for document retrieval systems with a speech interface. The system has several choices when generating a response. In our previous work, we optimized this selection by minimizing the Bayes risk, defined from a reward for correct information presentation and a penalty for redundant turns. In this chapter, the framework is extended so that it can be trained online through maximum likelihood estimation of the success probability of a response generation. The effectiveness of the proposed framework was demonstrated in an experiment with a large number of utterances from real users. The online learning method was also compared with a method using reinforcement learning, and the two are discussed in terms of convergence speed.
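The selection-and-update loop described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation. It assumes that each response candidate has a single Bernoulli success probability estimated by its observed success frequency (the maximum likelihood estimate for that simple model), and that the risk is the expected penalty minus the expected reward. The names `ResponseStats`, `REWARD`, `PENALTY`, `expected_risk`, `select_response`, and `update`, as well as the numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ResponseStats:
    """Running counts for one response candidate (hypothetical structure)."""
    successes: int = 0
    trials: int = 0

    def success_prob(self) -> float:
        # Bernoulli maximum likelihood estimate: successes / trials.
        # Before any observation, fall back to 0.5 (an assumed default).
        return self.successes / self.trials if self.trials else 0.5


REWARD = 10.0   # assumed reward for a correct information presentation
PENALTY = 1.0   # assumed penalty for a redundant turn


def expected_risk(p_success: float) -> float:
    """Expected risk of a response: failure penalty minus success reward."""
    return (1.0 - p_success) * PENALTY - p_success * REWARD


def select_response(stats: Dict[str, ResponseStats]) -> str:
    """Pick the candidate whose expected risk is smallest."""
    return min(stats, key=lambda name: expected_risk(stats[name].success_prob()))


def update(stats: Dict[str, ResponseStats], name: str, success: bool) -> None:
    """Online update: fold one observed outcome into the running counts."""
    stats[name].trials += 1
    stats[name].successes += int(success)


if __name__ == "__main__":
    stats = {"answer": ResponseStats(), "confirm": ResponseStats()}
    update(stats, "answer", True)
    update(stats, "confirm", False)
    print(select_response(stats))
```

Because the estimate is refreshed after every observed outcome, the selection can change from turn to turn without a separate batch training phase, which is the sense of "online" assumed in this sketch.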
CITATION STYLE
Misu, T., Sugiura, K., Kawahara, T., Ohtake, K., Hori, C., Kashioka, H., & Nakamura, S. (2011). Online Learning of Bayes Risk-Based Optimization of Dialogue Management for Document Retrieval Systems with Speech Interface. In Spoken Dialogue Systems Technology and Design (pp. 29–52). Springer New York. https://doi.org/10.1007/978-1-4419-7934-6_2