Text-Independent Speaker Recognition Using Deep Learning

Smriti Srivastava; Gopal Chaudhary; Chandrakesh Shukla

Book Chapter

Springer Science and Business Media Deutschland GmbH, (2021), 41-51

DOI: 10.1007/978-3-030-76167-7_2

0 Citations

1 Reader

Abstract

Speaker recognition is the process of recognizing a speaker by using speaker-specific information. Speaker recognition systems can be classified as text-dependent or text-independent. In a text-dependent system, the recognition phrases are fixed (known beforehand); for example, the user can be prompted to read a randomly selected sequence of numbers. In a text-independent system, by contrast, there are no constraints on the words the speakers are allowed to use, so what is spoken in training and what is uttered in actual use may have completely different content. The domain of speaker recognition can be further divided into speaker identification and speaker verification. Speaker verification evaluates whether a voice belongs to a given person, while speaker identification determines which person the voice belongs to. In this work, Mel-frequency cepstral coefficients (MFCC) were extracted from the audio files. These features were then fed to a convolutional neural network (CNN), which was tuned to increase model accuracy. Over six runs with varying parameters, a maximum accuracy of approximately 97% was achieved.
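The distinction between identification and verification described in the abstract can be illustrated with a minimal sketch. It assumes a classifier has already produced one score per enrolled speaker for an utterance. The function names `identify` and `verify`, the speaker labels, the score values, and the 0.5 threshold are all hypothetical and are not taken from the chapter.

```python
from typing import Dict


def identify(scores: Dict[str, float]) -> str:
    """Speaker identification: pick the enrolled speaker with the highest score."""
    return max(scores, key=scores.get)


def verify(scores: Dict[str, float], claimed: str, threshold: float = 0.5) -> bool:
    """Speaker verification: accept the claimed identity only if its score
    reaches the threshold. Unknown speakers get a score of 0.0."""
    return scores.get(claimed, 0.0) >= threshold


# Hypothetical per-speaker scores for a single utterance, for example the
# softmax output of a classifier. These numbers are made up for illustration.
scores = {"alice": 0.08, "bob": 0.85, "carol": 0.07}

print(identify(scores))         # which enrolled speaker is this?
print(verify(scores, "bob"))    # is this really bob?
print(verify(scores, "alice"))  # is this really alice?
```

Identification answers an open "who is speaking?" question over all enrolled speakers, whereas verification answers a yes/no question about one claimed identity, so only verification needs a decision threshold.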

Citation (APA)

Srivastava, S., Chaudhary, G., & Shukla, C. (2021). Text-Independent Speaker Recognition Using Deep Learning. In EAI/Springer Innovations in Communication and Computing (pp. 41–51). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-76167-7_2
