Neural Style Transfer Based Voice Mimicking for Personalized Audio Stories

Abstract

This paper demonstrates CNN-based neural style transfer on an audio dataset to make storytelling a personalized experience: users record a few sentences, which are used to mimic their voice. The user recordings are converted to spectrograms, whose style is transferred to the spectrogram of a base voice narrating the story, analogous to neural style transfer on images. The approach stands out because it needs only a small dataset and therefore takes less time to train the model. The project is intended specifically for children who prefer digital interaction and are increasingly leaving behind the storytelling culture, and for working parents who are unable to spend enough time with their children. By using a parent's initial recording to narrate a given story, it is designed to serve as a bridge between storytelling and screen time, engaging children through the implicit ethical themes of the stories and connecting them to their loved ones while ensuring an innocuous and meaningful learning experience.
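The pipeline the abstract describes (audio to spectrogram, then a style objective over spectrograms) can be sketched minimally. The following is an illustrative NumPy sketch, not the paper's implementation: it computes a magnitude spectrogram via a short-time Fourier transform and a Gram-matrix style loss of the kind used in image style transfer, treating frequency bins as channels. All function names and parameters here are assumptions for illustration.

```python
import numpy as np

def stft_magnitude(signal, frame_len=512, hop=128):
    """Magnitude spectrogram via a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len + 1, hop)]
    # rfft of each frame -> shape (num_frames, frame_len // 2 + 1)
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))

def gram_matrix(features):
    """Channel-channel correlations; these statistics capture 'style' (timbre)."""
    # features: (channels, time) -> (channels, channels)
    return features @ features.T / features.shape[1]

def style_loss(gram_target, gram_generated):
    """Squared Frobenius distance between Gram matrices (the style objective)."""
    return np.sum((gram_target - gram_generated) ** 2)

# Toy example: two synthetic one-second "voices" at 16 kHz
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 16000)
user_voice = np.sin(2 * np.pi * 220 * t) + 0.05 * rng.standard_normal(16000)
base_voice = np.sin(2 * np.pi * 110 * t) + 0.05 * rng.standard_normal(16000)

S_user = stft_magnitude(user_voice)   # (frames, freq_bins)
S_base = stft_magnitude(base_voice)

# Treat frequency bins as channels, as in spectrogram-based style transfer
loss = style_loss(gram_matrix(S_user.T), gram_matrix(S_base.T))
print(f"style loss between user and base voice: {loss:.2f}")
```

In a full system, this loss (usually computed on CNN feature maps of the spectrogram rather than the raw spectrogram) would be minimized by gradient descent over the generated spectrogram, which is then inverted back to audio.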

Citation (APA)

Fatima, S. M., Shehzad, M., Murtuza, S. S., & Raza, S. S. (2020). Neural Style Transfer Based Voice Mimicking for Personalized Audio Stories. In AI4TV 2020 - Proceedings of the 2nd International Workshop on AI for Smart TV Content Production, Access and Delivery (pp. 11–16). Association for Computing Machinery, Inc. https://doi.org/10.1145/3422839.3423063
