UltraSpeech: Speech Enhancement by Interaction between Ultrasound and Speech

Abstract

Speech enhancement benefits many practical voice-based interaction applications, where the goal is to recover clean speech under noisy ambient conditions. This paper presents a practical design, UltraSpeech, that enhances speech by exploiting the correlation between ultrasound (which profiles articulatory gestures) and speech. UltraSpeech uses a commodity smartphone to emit the ultrasound and to collect the composite acoustic signal for analysis. We design a complex masking framework that operates on complex-valued spectrograms and rectifies the magnitude and phase of speech simultaneously. We further introduce an interaction module that shares information between the ultrasound and speech branches, thereby enhancing their discrimination capabilities. Extensive experiments demonstrate that UltraSpeech increases the scale-invariant SDR (SI-SDR) by 12 dB, effectively improves speech intelligibility and quality, and is capable of generalizing to unknown speakers.
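
To make two terms from the abstract concrete, here is a minimal NumPy sketch of the commonly used definition of SI-SDR and of applying a complex-valued mask to a spectrogram. It is an illustration under stated assumptions, not the authors' implementation: the function names `si_sdr` and `apply_complex_mask` are hypothetical, and in UltraSpeech the mask would be predicted by the network rather than supplied directly.

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB (commonly used definition; not taken from the paper)."""
    # Remove the mean so the metric ignores DC offsets.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to obtain the scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    residual = estimate - target
    return 10.0 * np.log10((np.dot(target, target) + eps) / (np.dot(residual, residual) + eps))

def apply_complex_mask(noisy_spec, mask):
    """Element-wise complex multiplication: adjusts magnitude and phase together."""
    return noisy_spec * mask

# Illustrative data only (assumption): a random "clean" signal plus weak noise.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
noisy = clean + 0.1 * rng.standard_normal(16000)
score = si_sdr(noisy, clean)
```

Because the mask is complex, a single multiplication can both rescale a time-frequency bin and rotate its phase, which is what lets a complex masking framework correct magnitude and phase at the same time. Rescaling the estimate by any nonzero constant leaves `si_sdr` essentially unchanged, which is the "scale-invariant" part of the metric.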

Ding, H., Wang, Y., Li, H., Zhao, C., Wang, G., Xi, W., & Zhao, J. (2022). UltraSpeech: Speech Enhancement by Interaction between Ultrasound and Speech. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 6(3). https://doi.org/10.1145/3550303
