Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models

Abstract

This paper presents a novel method of enhancing esophageal speech using statistical voice conversion. Esophageal speech is one of the alternative speaking methods for laryngectomees. Although it does not require any external devices, the generated voices usually sound unnatural compared with normal speech. To improve the intelligibility and naturalness of esophageal speech, we propose a voice conversion method from esophageal speech into normal speech. A spectral parameter and excitation parameters of the target normal speech are estimated separately from a spectral parameter of the esophageal speech, based on Gaussian mixture models. The experimental results demonstrate that the proposed method yields significant improvements in intelligibility and naturalness. We also apply one-to-many eigenvoice conversion to esophageal speech enhancement to make it possible to flexibly control the voice quality of the enhanced speech. Copyright © 2010 The Institute of Electronics, Information and Communication Engineers.
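The GMM-based mapping mentioned in the abstract can be illustrated with a minimal sketch. It assumes a joint-density GMM over stacked source and target feature vectors and a frame-wise minimum mean-square error (MMSE) estimate of the target; the abstract does not specify the exact conversion rule, so this is one common formulation and not necessarily the authors' method. The function names (`log_gaussian`, `convert_mmse`) and the parameter layout are hypothetical. In a setup like the one described, separate models of this kind would map the esophageal spectral feature to the spectral and to the excitation parameters of normal speech.

```python
import numpy as np

def log_gaussian(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def convert_mmse(x, weights, means, covs, dx):
    """Map a source frame x to a target frame with a joint-density GMM.

    Assumption: each component models the stacked vector [x; y], so
    means[m] has length dx + dy and covs[m] is (dx + dy) x (dx + dy).
    """
    x = np.asarray(x, dtype=float)
    # Posterior probability of each component given the source frame only.
    log_post = np.array([
        np.log(w) + log_gaussian(x, mu[:dx], cov[:dx, :dx])
        for w, mu, cov in zip(weights, means, covs)
    ])
    log_post -= np.max(log_post)  # subtract the max for numerical stability
    post = np.exp(log_post)
    post /= post.sum()
    # Posterior-weighted sum of the per-component conditional means E[y | x, m].
    y = np.zeros(means[0].shape[0] - dx)
    for p, mu, cov in zip(post, means, covs):
        gain = np.linalg.solve(cov[:dx, :dx], x - mu[:dx])
        y += p * (mu[dx:] + cov[dx:, :dx] @ gain)
    return y

# Toy usage with a single component (illustrative numbers, not real speech data).
weights = [1.0]
means = [np.array([1.0, 2.0])]
covs = [np.array([[2.0, 1.0], [1.0, 3.0]])]
converted = convert_mmse([3.0], weights, means, covs, dx=1)
```

With one component the mapping reduces to a linear regression of the target on the source; additional components make it piecewise linear, weighted by how well each component explains the source frame.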

Doi, H., Nakamura, K., Toda, T., Saruwatari, H., & Shikano, K. (2010). Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models. IEICE Transactions on Information and Systems, E93-D(9), 2472–2482. https://doi.org/10.1587/transinf.E93.D.2472
