Deep neural networks for shimmer approximation in synthesized audio signal

Mario Alejandro García; Eduardo Atilio Destéfanis

Conference Proceedings

Communications in Computer and Information Science (2018) 790 3-12

DOI: 10.1007/978-3-319-75214-3_1

Citations: 1

Readers: 4

Abstract

Shimmer is a classical acoustic measure of amplitude perturbation in a signal. In the human voice, this kind of variation makes it possible to characterize properties not only of the voice itself but also of the speaker. In recent years, deep learning techniques have become the state of the art for voice recognition tasks. This work analyzes the relationship between shimmer and deep neural networks. A deep learning model is created that approximates the shimmer value of a simple synthesized audio signal (stationary and without formants), taking the spectrogram as its input feature. It is concluded, first, that for this kind of synthesized signal a neural network like the one proposed can approximate shimmer and, second, that the convolution layers can be designed to preserve the shimmer information and transmit it to the following layers.
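To make the quantity in the abstract concrete, the following is a minimal sketch of one common shimmer definition ("local" shimmer: the mean absolute difference between the peak amplitudes of consecutive cycles, divided by the mean peak amplitude), applied to a stationary, formant-free synthetic signal. The abstract does not state which shimmer formula or synthesis procedure the authors use, so the formula, the function names `synthesize_cycles` and `local_shimmer`, and the parameters `f0` and `sample_rate` are all assumptions made for illustration.

```python
import numpy as np


def synthesize_cycles(amplitudes, f0=100.0, sample_rate=16000):
    """Concatenate one sine cycle per entry of `amplitudes`.

    Assumption: a pure sine at a fixed f0 stands in for the "stationary,
    formant-free" signal; the paper's actual synthesis is not specified here.
    """
    samples_per_cycle = int(round(sample_rate / f0))
    phase = np.arange(samples_per_cycle) / samples_per_cycle
    one_cycle = np.sin(2.0 * np.pi * phase)
    signal = np.concatenate([a * one_cycle for a in amplitudes])
    return signal, samples_per_cycle


def local_shimmer(signal, samples_per_cycle):
    """Relative local shimmer of a signal with a known, constant cycle length.

    Assumption: cycle boundaries are known exactly, so no pitch tracking
    is needed; real voice recordings would require it.
    """
    n_cycles = len(signal) // samples_per_cycle
    cycles = signal[: n_cycles * samples_per_cycle].reshape(n_cycles, samples_per_cycle)
    peaks = np.abs(cycles).max(axis=1)  # peak amplitude of each cycle
    return np.mean(np.abs(np.diff(peaks))) / np.mean(peaks)


# Example: 200 cycles whose amplitudes vary randomly around 1.0.
rng = np.random.default_rng(0)
amplitudes = 1.0 + 0.05 * rng.standard_normal(200)
signal, samples_per_cycle = synthesize_cycles(amplitudes)
print(f"local shimmer: {local_shimmer(signal, samples_per_cycle):.4f}")
```

A value computed this way could serve as the regression target for a network that receives the spectrogram of `signal` as input, which is the setup the abstract describes; the network itself is not sketched here.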

Cite

García, M. A., & Destéfanis, E. A. (2018). Deep neural networks for shimmer approximation in synthesized audio signal. In Communications in Computer and Information Science (Vol. 790, pp. 3–12). Springer Verlag. https://doi.org/10.1007/978-3-319-75214-3_1
