Combining prosodic and text features for segmentation of Mandarin broadcast news

Abstract

Automatic topic segmentation, the separation of a discourse stream into its constituent stories or topics, is a necessary preprocessing step for applications such as information retrieval, anaphora resolution, and summarization. While significant progress has been made in this area for text sources and for English audio sources, little work has been done on automatic, acoustic feature-based segmentation of other languages. In this paper, we consider exploiting both prosodic and text-based features for topic segmentation of Mandarin Chinese. As a tone language, Mandarin presents special challenges for the applicability of intonation-based techniques, since the pitch contour is also used to establish lexical identity. We demonstrate that intonational cues, such as reductions in pitch and intensity at topic boundaries and increases in duration and pause, still provide significant contrasts in Mandarin Chinese. We build a decision tree classifier that, based only on word and local context prosodic information, without reference to term similarity, cue phrases, or sentence-level information, achieves boundary classification accuracy of 84.6-95.6% on a balanced test set. We contrast these results with classification using text-based features, exploiting both text similarity and n-gram cues, which achieves accuracies of 77-95.6% if silence features are used. Finally, we integrate prosody, text, and silence features using a voting strategy that combines decision tree classifiers for each feature subset individually and for all subsets jointly. This voted decision tree classifier yields an overall classification accuracy of 96.85%, with 2.8% miss and 3.15% false alarm rates on a representative corpus sample, demonstrating synergistic combination of prosodic and text features for topic segmentation.
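The voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so simple majority voting is assumed here, with ties resolved as "no boundary". The function names (`majority_vote`, `combine`, `error_rates`) and the toy predictions are hypothetical and are not taken from the paper. Each classifier is represented only by its per-word boundary decisions.

```python
from typing import Dict, List, Sequence


def majority_vote(votes: Sequence[bool]) -> bool:
    """Return True (boundary) only when strictly more than half the votes are True.

    Assumption: ties count as "no boundary"; the paper's rule may differ.
    """
    return sum(votes) * 2 > len(votes)


def combine(per_classifier: Dict[str, List[bool]]) -> List[bool]:
    """Combine per-word boundary decisions from several classifiers."""
    # zip(*...) regroups the decisions so each tuple holds all votes for one word.
    return [majority_vote(votes) for votes in zip(*per_classifier.values())]


def error_rates(predicted: List[bool], actual: List[bool]) -> Dict[str, float]:
    """Accuracy, miss rate (missed / true boundaries), false-alarm rate
    (spurious / true non-boundaries)."""
    pairs = list(zip(predicted, actual))
    boundaries = sum(actual)
    non_boundaries = len(actual) - boundaries
    misses = sum(1 for p, a in pairs if a and not p)
    false_alarms = sum(1 for p, a in pairs if p and not a)
    correct = sum(1 for p, a in pairs if p == a)
    return {
        "accuracy": correct / len(actual),
        "miss": misses / boundaries if boundaries else 0.0,
        "false_alarm": false_alarms / non_boundaries if non_boundaries else 0.0,
    }


# Hypothetical decisions from classifiers built on each feature subset
# and on all features jointly (toy data, not results from the paper).
predictions = {
    "prosody": [True, False, False, True, False, False],
    "text": [True, False, True, False, False, False],
    "silence": [False, False, False, True, False, True],
    "joint": [True, False, False, True, False, False],
}
actual = [True, False, False, True, False, False]

voted = combine(predictions)
rates = error_rates(voted, actual)
```

In this toy data each single-subset classifier makes at least one error, while the voted decision matches the reference labels, which is the kind of complementarity the abstract attributes to combining prosodic and text features.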

Cite

Levow, G. A. (2004). Combining prosodic and text features for segmentation of Mandarin broadcast news. In Proceedings of the 3rd SIGHAN Workshop on Chinese Language Processing, SIGHAN@ACL 2004 - Held in cooperation with ACL 2004 (pp. 102–108). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1626307.1626313
