Voice-activity and overlapped speech detection using x-vectors

Jiří Málek; Jindřich Žďánský

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020) 12284 LNAI 366-376

DOI: 10.1007/978-3-030-58323-1_40

Abstract

X-vectors are features extracted from speech signals by pretrained deep neural networks such that they discriminate well among different speakers. Their main application lies in speaker identification and verification. This manuscript studies which other properties are encoded in x-vectors. The focus lies on distinguishing speech from noise and utterances of a single speaker from overlapped speech. We attempt to show that the x-vector network is capable of extracting multi-purpose features that can be used by several simple back-end classifiers. This allows a common feature-extracting front-end for the tasks of voice-activity detection, overlapped speech detection, and speaker identification. Compared to the alternative strategy, that is, training independent classifiers with their own feature-extracting layers for each task, the common front-end saves computational time during both the training and test phases.
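
The shared front-end idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "front-end" is a fixed random projection with mean pooling standing in for a pretrained x-vector network, the back-ends are nearest-centroid classifiers, the data are synthetic, and all names (extract_embedding, NearestCentroid, toy_segment) are hypothetical. The point it shows is only that embeddings are computed once and then reused by several lightweight classifiers.

```python
import random
from statistics import fmean

random.seed(0)
DIM_IN, DIM_EMB = 8, 4

# Hypothetical stand-in for a frozen, pretrained x-vector network:
# a fixed random linear map applied per frame, then mean pooling.
W = [[random.gauss(0.0, 1.0) for _ in range(DIM_IN)] for _ in range(DIM_EMB)]


def extract_embedding(frames):
    """Map a variable-length list of frame vectors to one fixed-size vector."""
    projected = [[sum(w * x for w, x in zip(row, frame)) for row in W]
                 for frame in frames]
    return [fmean(dim) for dim in zip(*projected)]


class NearestCentroid:
    """Simple back-end (an assumption): predict the label of the closest class mean."""

    def fit(self, embeddings, labels):
        self.centroids = {}
        for label in sorted(set(labels)):
            members = [e for e, l in zip(embeddings, labels) if l == label]
            self.centroids[label] = [fmean(dim) for dim in zip(*members)]
        return self

    def predict(self, embedding):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(embedding, centroid))
        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def toy_segment(offset, n_frames=20):
    """Synthetic frames; the offset loosely mimics a segment type."""
    return [[random.gauss(offset, 1.0) for _ in range(DIM_IN)]
            for _ in range(n_frames)]


kinds = ["noise", "single", "overlap"] * 5
offsets = {"noise": 0.0, "single": 3.0, "overlap": 6.0}
segments = [toy_segment(offsets[k]) for k in kinds]

# The front-end runs once per segment; both back-ends reuse its output.
embeddings = [extract_embedding(s) for s in segments]

# Back-end 1: voice-activity detection (speech vs. noise).
vad_labels = ["noise" if k == "noise" else "speech" for k in kinds]
vad_backend = NearestCentroid().fit(embeddings, vad_labels)

# Back-end 2: overlapped speech detection, trained on speech segments only.
speech = [(e, k) for e, k in zip(embeddings, kinds) if k != "noise"]
osd_backend = NearestCentroid().fit([e for e, _ in speech],
                                    [k for _, k in speech])

query = embeddings[1]  # embedding of a synthetic single-speaker segment
print(vad_backend.predict(query), osd_backend.predict(query))
```

In this arrangement, adding a further task (for example speaker identification) would only require fitting another back-end on the already computed embeddings, which is the source of the computational saving the abstract refers to.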

Citation (APA)

Málek, J., & Žďánský, J. (2020). Voice-activity and overlapped speech detection using x-vectors. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12284 LNAI, pp. 366–376). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58323-1_40
