We study the problem of detecting linguistic events at interword boundaries, such as sentence boundaries and disfluency locations, in speech transcribed by an automatic recognizer. Recovering such events is crucial for speech understanding and other natural language processing tasks. Our approach combines prosodic cues, modeled by decision trees, with word-based event N-gram language models. Several model combination approaches are investigated. The techniques are evaluated on conversational speech from the Switchboard corpus. Model combination is shown to give a significant improvement over the individual knowledge sources.
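As an illustration of what combining the two knowledge sources can look like, the sketch below linearly interpolates per-boundary posteriors from a prosodic model and a language model. The abstract does not say which combination schemes were investigated, so the interpolation rule, the function names, the event labels, and the `weight` parameter are all assumptions made for this example, not the authors' method.

```python
def interpolate_posteriors(p_prosody, p_lm, weight=0.5):
    """Linearly interpolate two posterior distributions over boundary events.

    p_prosody, p_lm: dicts mapping an event label to a posterior probability.
    weight: hypothetical interpolation weight given to the prosodic model.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    events = set(p_prosody) | set(p_lm)
    combined = {
        e: weight * p_prosody.get(e, 0.0) + (1.0 - weight) * p_lm.get(e, 0.0)
        for e in events
    }
    total = sum(combined.values())
    if total <= 0.0:
        raise ValueError("posteriors must have positive total mass")
    # Renormalize so the result is a proper distribution.
    return {e: p / total for e, p in combined.items()}


def classify_boundary(p_prosody, p_lm, weight=0.5):
    """Return the most probable event label at one interword boundary."""
    combined = interpolate_posteriors(p_prosody, p_lm, weight)
    return max(combined, key=combined.get)
```

In practice the weight would presumably be tuned on held-out data; setting it to 0 or 1 recovers the decision of a single knowledge source, which gives a baseline against which the combined model can be compared.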