Characterizing and recognizing spoken corrections in human-computer dialogue

Abstract

Miscommunication in speech recognition systems is unavoidable, but a detailed characterization of user corrections will enable speech systems to identify when a correction is taking place and to more accurately recognize the content of correction utterances. In this paper we investigate the adaptations of users when they encounter recognition errors in interactions with a voice-in/voice-out spoken language system. In analyzing more than 300 pairs of original and repeat correction utterances, matched on speaker and lexical content, we found overall increases in both utterance and pause duration from original to correction. Interestingly, corrections of misrecognition errors (CME) exhibited significantly heightened pitch variability, while corrections of rejection errors (CRE) showed only a small but significant decrease in pitch minimum. CMEs demonstrated much greater increases in measures of duration and pitch variability than CREs. These contrasts allow the development of decision trees which distinguish CMEs from CREs and from original inputs at 70-75% accuracy based on duration, pitch, and amplitude features.

Levow, G. A. (1998). Characterizing and recognizing spoken corrections in human-computer dialogue. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 736–742). Association for Computational Linguistics (ACL). https://doi.org/10.3115/980845.980969
