Korean MULTEXT: A Korean prosody corpus

Sunhee Kim; Daniel Hirst; Hyongsil Cho; Ho Young Lee; Minhwa Chung

Conference Proceedings


Proceedings of the 4th International Conference on Speech Prosody, SP 2008 (2008) 139-142

DOI: 10.21437/speechprosody.2008-33


Abstract

This paper describes the contents of the Korean prosody corpus (Korean MULTEXT), a Korean version of the speech database Eurom1. The corpus consists of about 2 hours of read speech, transcribed primarily in orthography (in the Korean alphabet and in a Romanized transcription), as well as in IPA and in SAMPA. It also includes the original F0 values, stylized F0 values extracted using Momel, and hand-corrected F0 values. The prosodic events are annotated in two ways: automatically, using the INTSINT annotation algorithm, and manually, by labeling prosodic units with two tones on the hand-corrected pitch targets. The tone patterns resulting from the proposed Momel-based two-tone labeling are found to correspond to those defined in K-ToBI.
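To make the annotation layers listed in the abstract concrete, the following is a minimal sketch of how one utterance of such a corpus might be held in memory. It is not the corpus's actual file format: the class name, field names, and the example values are hypothetical. It also assumes that the two manual tones are written H and L and that there is one manual tone label per hand-corrected pitch target; the abstract does not state either point. The only fixed inventory used is the standard INTSINT symbol set (T, M, B, H, S, L, U, D).

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# The eight INTSINT tonal symbols:
# Top, Mid, Bottom (absolute); Higher, Same, Lower, Upstepped, Downstepped (relative).
INTSINT_SYMBOLS = frozenset("TMBHSLUD")

# Assumption: the two manual tones are written H and L (not stated in the abstract).
TWO_TONE_SYMBOLS = frozenset("HL")

# A pitch target as (time in seconds, F0 in Hz).
PitchTarget = Tuple[float, float]


@dataclass
class UtteranceRecord:
    """Hypothetical container for the tiers named in the abstract."""
    hangul: str                      # orthography in the Korean alphabet
    romanized: str                   # Romanized orthography
    ipa: str                         # IPA transcription
    sampa: str                       # SAMPA transcription
    raw_f0: List[float] = field(default_factory=list)                   # original F0 values
    momel_targets: List[PitchTarget] = field(default_factory=list)      # Momel-stylized targets
    corrected_targets: List[PitchTarget] = field(default_factory=list)  # hand-corrected targets
    intsint_labels: List[str] = field(default_factory=list)             # automatic INTSINT coding
    two_tone_labels: List[str] = field(default_factory=list)            # manual two-tone labels

    def problems(self) -> List[str]:
        """Return a list of consistency problems (empty if none are found)."""
        issues = []
        for sym in self.intsint_labels:
            if sym not in INTSINT_SYMBOLS:
                issues.append(f"unknown INTSINT symbol: {sym!r}")
        for sym in self.two_tone_labels:
            if sym not in TWO_TONE_SYMBOLS:
                issues.append(f"unknown tone label: {sym!r}")
        # Assumption: one manual tone label per hand-corrected pitch target.
        if len(self.two_tone_labels) != len(self.corrected_targets):
            issues.append("tone labels and hand-corrected targets differ in length")
        return issues


# Made-up illustrative values, not taken from the corpus.
example = UtteranceRecord(
    hangul="안녕하세요",
    romanized="annyeonghaseyo",
    ipa="<IPA transcription>",
    sampa="<SAMPA transcription>",
    raw_f0=[182.0, 185.5, 201.3, 176.4],
    momel_targets=[(0.10, 180.0), (0.35, 205.0), (0.60, 170.0)],
    corrected_targets=[(0.10, 180.0), (0.36, 204.0), (0.60, 171.0)],
    intsint_labels=["M", "H", "L"],
    two_tone_labels=["L", "H", "L"],
)

if __name__ == "__main__":
    print(example.problems())
```

Keeping the automatic and hand-corrected tiers as separate fields mirrors the abstract's distinction between Momel output and manual correction, so either can be checked or replaced independently.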

Cite (APA)

Kim, S., Hirst, D., Cho, H., Lee, H. Y., & Chung, M. (2008). Korean MULTEXT: A Korean prosody corpus. In Proceedings of the 4th International Conference on Speech Prosody, SP 2008 (pp. 139–142). International Speech Communication Association. https://doi.org/10.21437/speechprosody.2008-33
