Double Articulation Analyzer With Prosody for Unsupervised Word and Phone Discovery

Abstract

Word and phone discovery are important tasks in the language development of human infants. Infants acquire words and phones from unsegmented speech signals using segmentation cues such as distributional, prosodic, and co-occurrence information. Many existing computational models of this process tend to focus on distributional or prosodic cues. In this study, we propose a nonparametric Bayesian probabilistic generative model, the prosodic hierarchical Dirichlet process-hidden language model (prosodic HDP-HLM), which performs simultaneous phone and word discovery from continuous speech signals encoded as time-series data that may exhibit a double articulation structure. As an extension of HDP-HLM, prosodic HDP-HLM considers both prosodic and distributional cues within a single integrative generative model. We further propose a prosodic double articulation analyzer (Prosodic DAA) based on an inference procedure derived for prosodic HDP-HLM. We conducted three experiments on different types of data sets: Japanese vowel sequences, utterances for teaching object names and features, and utterances following Zipf's law. The results demonstrated the validity of the proposed method. The Prosodic DAA successfully used prosodic cues and discovered words directly from continuous human speech using distributional and prosodic information in an unsupervised manner, outperforming a method that used only distributional cues. In contrast, phone discovery performance did not improve. We also show that prosodic cues contributed more to word discovery performance when word frequencies were distributed more naturally, i.e., following Zipf's law.
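
As an illustration of what word frequencies "following Zipf's law" means, the following minimal Python sketch draws word tokens with probability proportional to 1/rank^s. It is not the authors' data-generation procedure: the function names, the exponent s = 1, the token count, and the toy vocabulary are all assumptions made for this example.

```python
import random
from collections import Counter


def zipf_weights(vocab_size, exponent=1.0):
    """Return normalized Zipf probabilities for ranks 1..vocab_size.

    The probability of the word at rank r is proportional to 1 / r**exponent.
    The default exponent of 1.0 is an assumption, not a value from the article.
    """
    raw = [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_words(vocab, n_tokens, exponent=1.0, seed=0):
    """Draw n_tokens words from vocab, where vocab[0] is the most frequent rank."""
    rng = random.Random(seed)
    weights = zipf_weights(len(vocab), exponent)
    return rng.choices(vocab, weights=weights, k=n_tokens)


# Hypothetical toy vocabulary of vowel-sequence "words", ordered by rank.
vocab = ["aioi", "aue", "ao", "ie", "uo"]
weights = zipf_weights(len(vocab))
tokens = sample_words(vocab, n_tokens=1000)
counts = Counter(tokens)

for word, weight in zip(vocab, weights):
    print(f"{word}: expected {weight:.3f}, observed {counts[word] / len(tokens):.3f}")
```

With an exponent of 1, the top-ranked word is expected to occur about twice as often as the second-ranked word and three times as often as the third. A uniform frequency distribution would instead give every word the same weight.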

Cite

Okuda, Y., Ozaki, R., Komura, S., & Taniguchi, T. (2023). Double Articulation Analyzer With Prosody for Unsupervised Word and Phone Discovery. IEEE Transactions on Cognitive and Developmental Systems, 15(3), 1335–1347. https://doi.org/10.1109/TCDS.2022.3210751
