Speaker recognition method for short utterance



Abstract

Speaker recognition is a technology that identifies a person from the identity information carried in their voice. Its advantages include convenient data collection, low collection cost, and high recognition accuracy. However, a short utterance carries little information, so recognition performance degrades rapidly as utterances get shorter. We propose a recognition model based on SincNet that aims to extract sufficient feature information from short utterances. In the feature extraction layer, the model uses a set of learnable sinc-based filter banks to extract features directly from the raw speech waveform, which lets the neural network discover more valuable voiceprint information. In the pooling layer, we design a dual-attention pooling method that combines a multi-head self-attention mechanism with a self-attention mechanism; this enriches the feature information and sharpens the distinction between key features, compensating for the limited information in short speech. We choose ArcFace as the loss function, which maximizes the classification margin in angular space and thereby improves the model's classification ability. Experimental results demonstrate that the proposed model performs better than the benchmark model.
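
The three components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, kernel size, sample rate, margin, and scale are all assumptions chosen for illustration, and the pooling shown is a plain single-head attentive pooling rather than the paper's dual-attention design. It shows how a sinc filter is fully described by two cutoff frequencies, how attention weights turn a variable number of frames into one vector, and how ArcFace adds an angular margin to the target class.

```python
import numpy as np

def sinc_bandpass(f_low, f_high, kernel_size=101, sample_rate=16000):
    """Band-pass FIR kernel parameterised only by two cutoffs in Hz.
    In a SincNet-style layer these two cutoffs would be the learnable
    parameters; here they are fixed inputs (illustrative assumption)."""
    # Time axis in seconds, centred on zero.
    t = (np.arange(kernel_size) - (kernel_size - 1) / 2) / sample_rate
    # np.sinc(x) = sin(pi x) / (pi x), so 2f * sinc(2f t) is an ideal
    # low-pass at f; the difference of two low-passes is a band-pass.
    band = 2 * f_high * np.sinc(2 * f_high * t) - 2 * f_low * np.sinc(2 * f_low * t)
    # Hamming window smooths the ripples caused by truncating the sinc.
    kernel = band * np.hamming(kernel_size)
    return kernel / np.abs(kernel).max()

def attentive_pool(frames, w):
    """Single-head attentive pooling: frames is (T, D), w is a (D,)
    scoring vector. Returns a (D,) softmax-weighted mean over time."""
    scores = frames @ w
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ frames

def arcface_logits(embedding, class_weights, label, margin=0.5, scale=30.0):
    """Cosine logits with an additive angular margin on the target class.
    Simplified: ignores the theta + margin > pi edge case."""
    e = embedding / np.linalg.norm(embedding)
    W = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    cos = np.clip(W @ e, -1.0, 1.0)
    logits = cos.copy()
    # Penalise the true class by widening its angle to the embedding.
    logits[label] = np.cos(np.arccos(cos[label]) + margin)
    return scale * logits

# Usage: filter one second of noise, pool framed output, score two classes.
rng = np.random.default_rng(0)
wave = rng.standard_normal(16000)
filtered = np.convolve(wave, sinc_bandpass(500.0, 1500.0), mode="same")
frames = filtered[:15600].reshape(-1, 400)       # 39 frames of 400 samples
pooled = attentive_pool(frames, np.zeros(400))   # zero scores -> plain mean
logits = arcface_logits(np.array([1.0, 0.0]), np.eye(2), label=0)
```

Because the margin lowers only the target-class logit, the network has to pull each embedding closer in angle to its own class centre to reach the same loss, which is what produces the wider angular separation between speakers.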

Citation (APA)

Guo, M., Yang, J., & Gao, S. (2021). Speaker recognition method for short utterance. In Journal of Physics: Conference Series (Vol. 1827). IOP Publishing Ltd. https://doi.org/10.1088/1742-6596/1827/1/012158
